[jira] [Updated] (HDFS-11898) DFSClient#isHedgedReadsEnabled() should be per client flag

2017-06-01 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-11898:
-
Attachment: HDFS-11898-02.patch

Attached the patch to add the flag per client.

Maybe the discussion of whether to make the thread pool static or not can be
taken up in HDFS-11900?

> DFSClient#isHedgedReadsEnabled() should be per client flag 
> ---
>
> Key: HDFS-11898
> URL: https://issues.apache.org/jira/browse/HDFS-11898
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Attachments: HDFS-11898-01.patch, HDFS-11898-02.patch
>
>
> DFSClient#isHedgedReadsEnabled() returns a value based on the static 
> {{HEDGED_READ_THREAD_POOL}}. 
> Hence, if any client in the JVM initializes this pool, reads from all 
> remaining clients will go through hedged reads as well.
> This flag should be a per-client value.
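
A minimal sketch of the per-client flag idea, for illustration only (the class
shape and field names below are assumptions, not the actual HDFS-11898 patch):

{code}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch only: the thread pool can stay static and shared across clients,
// but the enabled/disabled decision moves into each client instance.
public class HedgedReadFlagSketch {

  // Shared by every client in the JVM. With the old static check, the first
  // client to initialize this pool flipped "hedged reads enabled" for all.
  private static ThreadPoolExecutor HEDGED_READ_THREAD_POOL;

  // Per-client flag, derived from this client's own configuration.
  private final boolean hedgedReadsEnabled;

  public HedgedReadFlagSketch(int hedgedReadPoolSize) {
    this.hedgedReadsEnabled = hedgedReadPoolSize > 0;
    if (hedgedReadsEnabled) {
      initHedgedReadPool(hedgedReadPoolSize);
    }
  }

  private static synchronized void initHedgedReadPool(int poolSize) {
    if (HEDGED_READ_THREAD_POOL == null) {
      HEDGED_READ_THREAD_POOL = new ThreadPoolExecutor(1, poolSize,
          60, TimeUnit.SECONDS, new SynchronousQueue<>());
    }
  }

  public boolean isHedgedReadsEnabled() {
    // Old, buggy behavior: return HEDGED_READ_THREAD_POOL != null;
    return hedgedReadsEnabled; // decided per client, not per JVM
  }
}
{code}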






[jira] [Updated] (HDFS-11822) Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"

2017-06-01 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11822:
-
Attachment: (was: HDFS-11822-HDFS-7240.003.patch)

> Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"
> --
>
> Key: HDFS-11822
> URL: https://issues.apache.org/jira/browse/HDFS-11822
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11822-HDFS-7240.001.patch, 
> HDFS-11822-HDFS-7240.002.patch, HDFS-11822-HDFS-7240.003.patch
>
>
> TestCBlockCLI is failing because of a bind error.
> https://builds.apache.org/job/PreCommit-HDFS-Build/19429/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {code}
> org.apache.hadoop.cblock.TestCBlockCLI  Time elapsed: 0.668 sec  <<< ERROR!
> java.net.BindException: Problem binding to [0.0.0.0:9810] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:543)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:1033)
>   at org.apache.hadoop.ipc.Server.(Server.java:2791)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:960)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:420)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:341)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:802)
>   at 
> org.apache.hadoop.cblock.CBlockManager.startRpcServer(CBlockManager.java:215)
>   at org.apache.hadoop.cblock.CBlockManager.(CBlockManager.java:131)
>   at org.apache.hadoop.cblock.TestCBlockCLI.setup(TestCBlockCLI.java:57)
> {code}
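
The usual remedy for this kind of failure is to avoid the hard-coded port
(9810 here) and let the OS assign a free ephemeral port during test setup. A
minimal plain-JDK sketch of that approach (illustrative only, not necessarily
what the attached patch does):

{code}
import java.io.IOException;
import java.net.ServerSocket;

// Sketch: pick a free port for the test RPC server instead of a fixed 9810,
// so leftover or parallel servers cannot cause "Address already in use".
public final class FreePortUtil {
  private FreePortUtil() {}

  public static int findFreePort() throws IOException {
    // Binding to port 0 makes the OS assign an unused ephemeral port.
    try (ServerSocket socket = new ServerSocket(0)) {
      socket.setReuseAddress(true);
      return socket.getLocalPort();
    }
  }
}
{code}

In the test's setup the server address would then be set to
127.0.0.1:<freePort> before starting CBlockManager; there is a small window in
which another process could grab the port, but in practice this makes the test
robust.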






[jira] [Updated] (HDFS-11822) Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"

2017-06-01 Thread Mukul Kumar Singh (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mukul Kumar Singh updated HDFS-11822:
-
Attachment: HDFS-11822-HDFS-7240.003.patch

> Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"
> --
>
> Key: HDFS-11822
> URL: https://issues.apache.org/jira/browse/HDFS-11822
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11822-HDFS-7240.001.patch, 
> HDFS-11822-HDFS-7240.002.patch, HDFS-11822-HDFS-7240.003.patch
>
>
> TestCBlockCLI is failing because of a bind error.
> https://builds.apache.org/job/PreCommit-HDFS-Build/19429/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {code}
> org.apache.hadoop.cblock.TestCBlockCLI  Time elapsed: 0.668 sec  <<< ERROR!
> java.net.BindException: Problem binding to [0.0.0.0:9810] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:543)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:1033)
>   at org.apache.hadoop.ipc.Server.(Server.java:2791)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:960)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:420)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:341)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:802)
>   at 
> org.apache.hadoop.cblock.CBlockManager.startRpcServer(CBlockManager.java:215)
>   at org.apache.hadoop.cblock.CBlockManager.(CBlockManager.java:131)
>   at org.apache.hadoop.cblock.TestCBlockCLI.setup(TestCBlockCLI.java:57)
> {code}






[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-06-01 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034160#comment-16034160
 ] 

Vinayakumar B commented on HDFS-5042:
-

Thanks a lot [~kihwal] for the reviews and commit.
Thanks everyone for the discussion and for pushing this long-pending issue to 
closure.

> Completed files lost after power failure
> 
>
> Key: HDFS-5042
> URL: https://issues.apache.org/jira/browse/HDFS-5042
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: ext3 on CentOS 5.7 (kernel 2.6.18-274.el5)
>Reporter: Dave Latham
>Assignee: Vinayakumar B
>Priority: Critical
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-5042-01.patch, HDFS-5042-02.patch, 
> HDFS-5042-03.patch, HDFS-5042-04.patch, HDFS-5042-05-branch-2.patch, 
> HDFS-5042-05.patch, HDFS-5042-branch-2-01.patch, HDFS-5042-branch-2-05.patch, 
> HDFS-5042-branch-2.7-05.patch, HDFS-5042-branch-2.7-06.patch, 
> HDFS-5042-branch-2.8-05.patch, HDFS-5042-branch-2.8-06.patch
>
>
> We suffered a cluster-wide power failure, after which HDFS lost data that it 
> had acknowledged as closed and complete.
> The client was HBase which compacted a set of HFiles into a new HFile, then 
> after closing the file successfully, deleted the previous versions of the 
> file.  The cluster then lost power, and when brought back up the newly 
> created file was marked CORRUPT.
> Based on reading the logs it looks like the replicas were created by the 
> DataNodes in the 'blocksBeingWritten' directory.  Then when the file was 
> closed they were moved to the 'current' directory.  After the power cycle 
> those replicas were again in the blocksBeingWritten directory of the 
> underlying file system (ext3).  When those DataNodes reported in to the 
> NameNode it deleted those replicas and lost the file.
> Some possible fixes could be having the DataNode fsync the directory(s) after 
> moving the block from blocksBeingWritten to current, to ensure the rename is 
> durable, or having the NameNode accept replicas from blocksBeingWritten under 
> certain circumstances.
> Log snippets from RS (RegionServer), NN (NameNode), DN (DataNode):
> {noformat}
> RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: 
> Creating 
> file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  with permission=rwxrwxrwx
> NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c.
>  blk_1395839728632046111_357084589
> DN 2013-06-29 11:16:06,832 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block 
> blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: 
> /10.0.5.237:50010
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Received block 
> blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block 
> blk_1395839728632046111_357084589 terminating
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing 
> lease on  file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  from client DFSClient_hb_rs_hs745,60020,1372470111932
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  is closed by DFSClient_hb_rs_hs745,60020,1372470111932
> RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Renaming compacted file at 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  to 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
> RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Completed major compaction of 7 file(s) in n of 
> users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into 
> 
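
A minimal sketch of the directory-fsync idea proposed in the description
above, using plain NIO (illustrative only, not the committed HDFS-5042
change):

{code}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Sketch: durably move a finalized replica by fsync'ing the destination
// directory, so the rename itself survives a power failure.
public final class DurableRename {
  private DurableRename() {}

  public static void renameDurably(Path src, Path dst) throws IOException {
    Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    fsyncDirectory(dst.getParent());
  }

  // Opening a directory read-only and calling force() issues fsync(2) on it.
  // This works on Linux; some platforms do not allow opening directories.
  private static void fsyncDirectory(Path dir) throws IOException {
    try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
      ch.force(true);
    }
  }
}
{code}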

[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034156#comment-16034156
 ] 

Vinayakumar B commented on HDFS-11856:
--

Thanks a lot [~kihwal] for the reviews and commit.

> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted, it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with 
> pipeline recoveries and writes. These restarted nodes will be considered 
> bad and will be excluded for the lifetime of the stream.
> In a small cluster, where the total node count is just 3, each time a remote 
> node restarts for an upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one node 
> at the end, since there are no other nodes to replace them.
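
For reference, the 30-second wait in point 1 is the client's restart timeout,
controlled by the standard {{dfs.client.datanode-restart.timeout}} property
(default 30s). A small illustrative sketch of tuning it; the class itself is a
hypothetical example, and the accepted value format can vary by release:

{code}
import org.apache.hadoop.conf.Configuration;

// Sketch: how long a local client waits for a restarting datanode before
// treating it as failed. "60s" assumes a release that parses time units.
public class RestartTimeoutExample {
  public static Configuration withRestartTimeout(String timeout) {
    Configuration conf = new Configuration();
    conf.set("dfs.client.datanode-restart.timeout", timeout); // e.g. "60s"
    return conf;
  }
}
{code}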






[jira] [Commented] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034148#comment-16034148
 ] 

Hudson commented on HDFS-11359:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11817 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11817/])
HDFS-11359. DFSAdmin report command supports displaying maintenance (yqlin: rev 
8d9084eb62f4593d4dfeb618abacf6ae89019109)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/proto/ClientNamenodeProtocol.proto
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSCommands.md
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelperClient.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMaintenanceState.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/HdfsConstants.java


> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.
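
For reference, a hedged sketch of driving the report command programmatically;
the maintenance filter flags are assumed from the JIRA summary and should be
checked against the committed HDFSCommands.md:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.tools.DFSAdmin;
import org.apache.hadoop.util.ToolRunner;

// Sketch: run "dfsadmin -report" filtered to maintenance-state datanodes.
// The "-enteringmaintenance"/"-inmaintenance" flag names are assumptions.
public class MaintenanceReportExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    int exitCode = ToolRunner.run(conf, new DFSAdmin(),
        new String[] {"-report", "-enteringmaintenance", "-inmaintenance"});
    System.exit(exitCode);
  }
}
{code}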






[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034143#comment-16034143
 ] 

John Zhuge commented on HDFS-11856:
---

Thanks [~vinayrpet].

> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted, it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with 
> pipeline recoveries and writes. These restarted nodes will be considered 
> bad and will be excluded for the lifetime of the stream.
> In a small cluster, where the total node count is just 3, each time a remote 
> node restarts for an upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one node 
> at the end, since there are no other nodes to replace them.






[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034142#comment-16034142
 ] 

Vinayakumar B commented on HDFS-11856:
--

bq. Will this patch help clusters with more than 3 DNs?
Yes, this should work there as well. The only prerequisite is that the 
OOB_RESTART flag should reach the client from the upgrading node.


> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted, it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with 
> pipeline recoveries and writes. These restarted nodes will be considered 
> bad and will be excluded for the lifetime of the stream.
> In a small cluster, where the total node count is just 3, each time a remote 
> node restarts for an upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one node 
> at the end, since there are no other nodes to replace them.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
Priority: Major  (was: Minor)

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha4
   2.9.0

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Commented] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034136#comment-16034136
 ] 

Yiqun Lin commented on HDFS-11359:
--

Verified the unit test locally; it ran okay. Committed this to trunk 
and branch-2.
Thanks everyone for the discussions and reviews. :)

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets

2017-06-01 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034131#comment-16034131
 ] 

Weiwei Yang commented on HDFS-11779:


The UT failure in class {{TestKeySpaceManager}} was not related to this patch; 
I have created HDFS-11913 to fix it.

> Ozone: KSM: add listBuckets
> ---
>
> Key: HDFS-11779
> URL: https://issues.apache.org/jira/browse/HDFS-11779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
> Attachments: HDFS-11779-HDFS-7240.001.patch, 
> HDFS-11779-HDFS-7240.002.patch, HDFS-11779-HDFS-7240.003.patch, 
> HDFS-11779-HDFS-7240.004.patch, HDFS-11779-HDFS-7240.005.patch, 
> HDFS-11779-HDFS-7240.006.patch, HDFS-11779-HDFS-7240.007.patch, 
> HDFS-11779-HDFS-7240.008.patch, HDFS-11779-HDFS-7240.009.patch, 
> HDFS-11779-HDFS-7240.010.patch, HDFS-11779-HDFS-7240.011.patch
>
>
> Lists the buckets of a given volume. Similar to listVolumes, paging is 
> supported via prevKey, prefix, and maxKeys.
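
A sketch of how prevKey/prefix/maxKeys paging typically composes on the caller
side; {{KsmClient}}, {{BucketInfo}} and {{listBuckets(...)}} below are
hypothetical stand-ins, not the real KSM interfaces:

{code}
import java.io.IOException;
import java.util.List;

// Hypothetical client surface for illustrating the paging contract.
interface KsmClient {
  List<BucketInfo> listBuckets(String volume, String prevKey,
      String prefix, int maxKeys) throws IOException;
}

class BucketInfo {
  String name;
}

class ListBucketsExample {
  static void listAll(KsmClient client, String volume) throws IOException {
    final int maxKeys = 100;  // page size
    String prevKey = null;    // null/empty means start from the beginning
    while (true) {
      List<BucketInfo> page =
          client.listBuckets(volume, prevKey, /* prefix */ null, maxKeys);
      for (BucketInfo b : page) {
        System.out.println(b.name);
      }
      if (page.size() < maxKeys) {
        break;                // short page: no more results
      }
      prevKey = page.get(page.size() - 1).name; // resume after the last key
    }
  }
}
{code}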






[jira] [Updated] (HDFS-11913) Ozone: TestKeySpaceManager#testDeleteVolume fails

2017-06-01 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11913:
---
Description: 
HDFS-11774 introduced a UT failure in 
{{TestKeySpaceManager#testDeleteVolume}}; the error is below.

{noformat}
java.util.NoSuchElementException
 at 
org.fusesource.leveldbjni.internal.JniDBIterator.peekNext(JniDBIterator.java:84)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:98)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:45)
 at 
org.apache.hadoop.ozone.ksm.MetadataManagerImpl.isVolumeEmpty(MetadataManagerImpl.java:221)
 at 
org.apache.hadoop.ozone.ksm.VolumeManagerImpl.deleteVolume(VolumeManagerImpl.java:294)
 at 
org.apache.hadoop.ozone.ksm.KeySpaceManager.deleteVolume(KeySpaceManager.java:340)
 at 
org.apache.hadoop.ozone.protocolPB.KeySpaceManagerProtocolServerSideTranslatorPB.deleteVolume(KeySpaceManagerProtocolServerSideTranslatorPB.java:200)
 at 
org.apache.hadoop.ozone.protocol.proto.KeySpaceManagerProtocolProtos$KeySpaceManagerService$2.callBlockingMethod(KeySpaceManagerProtocolProtos.java:22742)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
{noformat}

This is caused by buggy code in {{MetadataManagerImpl#isVolumeEmpty}}; there 
are 2 issues that need to be fixed:
# Iterating to the next element throws this exception when there is no next 
element, so the check always fails when a volume is empty.
# The code was checking whether the first bucket name starts with 
"/volume_name"; this returns a wrong value when there are several empty 
volumes with the same prefix, e.g. "/volA/" and "/volAA/". In that case 
{{isVolumeEmpty}} will return false, as the next element after "/volA/" is 
not a bucket but another volume, "/volAA/", which matches the prefix.

For now an empty volume with the name "/volA/" is probably not valid, but 
making sure our bucket key starts with "/volA/" instead of just "/volA" is a 
good way to keep us away from weird problems.
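
A minimal sketch of a fix covering both issues (a hasNext guard plus a
trailing-slash prefix); the plain {{Iterator}} below stands in for the leveldb
iterator, and this is not the actual patch:

{code}
import java.util.Iterator;

// Sketch of the two fixes: (1) guard iteration with hasNext() so an empty
// volume cannot trigger NoSuchElementException, and (2) match the prefix
// "/volA/" with its trailing slash so "/volAA/" is not mistaken for a
// bucket of "/volA".
class IsVolumeEmptySketch {
  static boolean isVolumeEmpty(Iterator<String> keysAfterVolume,
      String volume) {
    // Fix 2: include the trailing separator in the prefix we match.
    String bucketPrefix = "/" + volume + "/";
    // Fix 1: check hasNext() before calling next().
    if (!keysAfterVolume.hasNext()) {
      return true; // no keys after the volume key at all
    }
    String nextKey = keysAfterVolume.next();
    // If the next key is not a bucket under this volume (e.g. it is another
    // volume such as "/volAA/"), the volume is empty.
    return !nextKey.startsWith(bucketPrefix);
  }
}
{code}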

  was:
HDFS-11774 introduced a UT failure in 
{{TestKeySpaceManager#testDeleteVolume}}; the error is below.

{noformat}
java.util.NoSuchElementException
 at 
org.fusesource.leveldbjni.internal.JniDBIterator.peekNext(JniDBIterator.java:84)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:98)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:45)
 at 
org.apache.hadoop.ozone.ksm.MetadataManagerImpl.isVolumeEmpty(MetadataManagerImpl.java:221)
 at 
org.apache.hadoop.ozone.ksm.VolumeManagerImpl.deleteVolume(VolumeManagerImpl.java:294)
 at 
org.apache.hadoop.ozone.ksm.KeySpaceManager.deleteVolume(KeySpaceManager.java:340)
 at 
org.apache.hadoop.ozone.protocolPB.KeySpaceManagerProtocolServerSideTranslatorPB.deleteVolume(KeySpaceManagerProtocolServerSideTranslatorPB.java:200)
 at 
org.apache.hadoop.ozone.protocol.proto.KeySpaceManagerProtocolProtos$KeySpaceManagerService$2.callBlockingMethod(KeySpaceManagerProtocolProtos.java:22742)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
{noformat}

This is caused by buggy code in {{MetadataManagerImpl#isVolumeEmpty}}; there 
are 2 issues that need to be fixed:
# Iterating to the next element throws this exception when there is no next 
element, so the check always fails when a volume is empty.
# The code was checking whether the first bucket name starts with 
"/volume_name"; this returns a wrong value when there are several empty 
volumes with the same prefix, e.g. "/volA/" and "/volAA/". In that case 
{{isVolumeEmpty}} will return false, as the next element after "/volA/" is 
not a bucket but another volume, "/volAA/", which matches the prefix.


> Ozone: TestKeySpaceManager#testDeleteVolume fails
> -
>
> Key: HDFS-11913
> URL: https://issues.apache.org/jira/browse/HDFS-11913
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ozone
>Reporter: Weiwei Yang
>

[jira] [Updated] (HDFS-11913) Ozone: TestKeySpaceManager#testDeleteVolume fails

2017-06-01 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11913:
---
Status: Patch Available  (was: Open)

> Ozone: TestKeySpaceManager#testDeleteVolume fails
> -
>
> Key: HDFS-11913
> URL: https://issues.apache.org/jira/browse/HDFS-11913
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11913-HDFS-7240.001.patch
>
>
> HDFS-11774 introduced a UT failure in 
> {{TestKeySpaceManager#testDeleteVolume}}; the error is below.
> {noformat}
> java.util.NoSuchElementException
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.peekNext(JniDBIterator.java:84)
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:98)
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:45)
>  at 
> org.apache.hadoop.ozone.ksm.MetadataManagerImpl.isVolumeEmpty(MetadataManagerImpl.java:221)
>  at 
> org.apache.hadoop.ozone.ksm.VolumeManagerImpl.deleteVolume(VolumeManagerImpl.java:294)
>  at 
> org.apache.hadoop.ozone.ksm.KeySpaceManager.deleteVolume(KeySpaceManager.java:340)
>  at 
> org.apache.hadoop.ozone.protocolPB.KeySpaceManagerProtocolServerSideTranslatorPB.deleteVolume(KeySpaceManagerProtocolServerSideTranslatorPB.java:200)
>  at 
> org.apache.hadoop.ozone.protocol.proto.KeySpaceManagerProtocolProtos$KeySpaceManagerService$2.callBlockingMethod(KeySpaceManagerProtocolProtos.java:22742)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
> {noformat}
> This is caused by buggy code in {{MetadataManagerImpl#isVolumeEmpty}}; 
> there are 2 issues that need to be fixed:
> # Iterating to the next element throws this exception when there is no next 
> element, so the check always fails when a volume is empty.
> # The code was checking whether the first bucket name starts with 
> "/volume_name"; this returns a wrong value when there are several empty 
> volumes with the same prefix, e.g. "/volA/" and "/volAA/". In that case 
> {{isVolumeEmpty}} will return false, as the next element after "/volA/" is 
> not a bucket but another volume, "/volAA/", which matches the prefix.
> For now an empty volume with the name "/volA/" is probably not valid, but 
> making sure our bucket key starts with "/volA/" instead of just "/volA" is 
> a good way to keep us away from weird problems.






[jira] [Updated] (HDFS-11893) Fix TestDFSShell.testMoveWithTargetPortEmpty failure.

2017-06-01 Thread Akira Ajisaka (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-11893:
-
Fix Version/s: (was: 2.8.2)

> Fix TestDFSShell.testMoveWithTargetPortEmpty failure.
> -
>
> Key: HDFS-11893
> URL: https://issues.apache.org/jira/browse/HDFS-11893
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.4
>Reporter: Konstantin Shvachko
>Assignee: Brahma Reddy Battula
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.1
>
> Attachments: HDFS-11893-001.patch, HDFS-11893-002.patch, 
> org.apache.hadoop.hdfs.TestDFSShell-output.txt, 
> org.apache.hadoop.hdfs.TestDFSShell.txt
>
>
> {{TestDFSShell.testMoveWithTargetPortEmpty()}} is consistently failing.






[jira] [Updated] (HDFS-11913) Ozone: TestKeySpaceManager#testDeleteVolume fails

2017-06-01 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11913:
---
Attachment: HDFS-11913-HDFS-7240.001.patch

> Ozone: TestKeySpaceManager#testDeleteVolume fails
> -
>
> Key: HDFS-11913
> URL: https://issues.apache.org/jira/browse/HDFS-11913
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ozone
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-11913-HDFS-7240.001.patch
>
>
> HDFS-11774 introduced a UT failure in 
> {{TestKeySpaceManager#testDeleteVolume}}; the error is below.
> {noformat}
> java.util.NoSuchElementException
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.peekNext(JniDBIterator.java:84)
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:98)
>  at 
> org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:45)
>  at 
> org.apache.hadoop.ozone.ksm.MetadataManagerImpl.isVolumeEmpty(MetadataManagerImpl.java:221)
>  at 
> org.apache.hadoop.ozone.ksm.VolumeManagerImpl.deleteVolume(VolumeManagerImpl.java:294)
>  at 
> org.apache.hadoop.ozone.ksm.KeySpaceManager.deleteVolume(KeySpaceManager.java:340)
>  at 
> org.apache.hadoop.ozone.protocolPB.KeySpaceManagerProtocolServerSideTranslatorPB.deleteVolume(KeySpaceManagerProtocolServerSideTranslatorPB.java:200)
>  at 
> org.apache.hadoop.ozone.protocol.proto.KeySpaceManagerProtocolProtos$KeySpaceManagerService$2.callBlockingMethod(KeySpaceManagerProtocolProtos.java:22742)
>  at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
>  at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
>  at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
>  at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
> {noformat}
> This is caused by buggy code in {{MetadataManagerImpl#isVolumeEmpty}}; 
> there are 2 issues that need to be fixed:
> # Iterating to the next element throws this exception when there is no next 
> element, so the check always fails when a volume is empty.
> # The code was checking whether the first bucket name starts with 
> "/volume_name"; this returns a wrong value when there are several empty 
> volumes with the same prefix, e.g. "/volA/" and "/volAA/". In that case 
> {{isVolumeEmpty}} will return false, as the next element after "/volA/" is 
> not a bucket but another volume, "/volAA/", which matches the prefix.






[jira] [Created] (HDFS-11913) Ozone: TestKeySpaceManager#testDeleteVolume fails

2017-06-01 Thread Weiwei Yang (JIRA)
Weiwei Yang created HDFS-11913:
--

 Summary: Ozone: TestKeySpaceManager#testDeleteVolume fails
 Key: HDFS-11913
 URL: https://issues.apache.org/jira/browse/HDFS-11913
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ozone
Reporter: Weiwei Yang
Assignee: Weiwei Yang


HDFS-11774 introduced a UT failure in 
{{TestKeySpaceManager#testDeleteVolume}}; the error is below.

{noformat}
java.util.NoSuchElementException
 at 
org.fusesource.leveldbjni.internal.JniDBIterator.peekNext(JniDBIterator.java:84)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:98)
 at org.fusesource.leveldbjni.internal.JniDBIterator.next(JniDBIterator.java:45)
 at 
org.apache.hadoop.ozone.ksm.MetadataManagerImpl.isVolumeEmpty(MetadataManagerImpl.java:221)
 at 
org.apache.hadoop.ozone.ksm.VolumeManagerImpl.deleteVolume(VolumeManagerImpl.java:294)
 at 
org.apache.hadoop.ozone.ksm.KeySpaceManager.deleteVolume(KeySpaceManager.java:340)
 at 
org.apache.hadoop.ozone.protocolPB.KeySpaceManagerProtocolServerSideTranslatorPB.deleteVolume(KeySpaceManagerProtocolServerSideTranslatorPB.java:200)
 at 
org.apache.hadoop.ozone.protocol.proto.KeySpaceManagerProtocolProtos$KeySpaceManagerService$2.callBlockingMethod(KeySpaceManagerProtocolProtos.java:22742)
 at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:867)
 at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:813)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2659)
{noformat}

This is caused by buggy code in {{MetadataManagerImpl#isVolumeEmpty}}; there 
are 2 issues that need to be fixed:
# Iterating to the next element throws this exception when there is no next 
element, so the check always fails when a volume is empty.
# The code was checking whether the first bucket name starts with 
"/volume_name"; this returns a wrong value when there are several empty 
volumes with the same prefix, e.g. "/volA/" and "/volAA/". In that case 
{{isVolumeEmpty}} will return false, as the next element after "/volA/" is 
not a bucket but another volume, "/volAA/", which matches the prefix.






[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034079#comment-16034079
 ] 

Hadoop QA commented on HDFS-11779:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
32s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
33s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
29s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
35s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 2 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
51s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
37s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 1 
unchanged - 1 fixed = 1 total (was 2) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
14s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 20s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}103m 53s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.ozone.ksm.TestKeySpaceManager |
|   | hadoop.hdfs.server.balancer.TestBalancerRPCDelay |
|   | hadoop.cblock.TestCBlockServer |
| Timed out junit tests | 
org.apache.hadoop.ozone.container.ozoneimpl.TestRatisManager |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure170 |
|   | org.apache.hadoop.hdfs.shortcircuit.TestShortCircuitCache |
|   | org.apache.hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure 
|
|   | org.apache.hadoop.hdfs.TestGetBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |

[jira] [Commented] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034072#comment-16034072
 ] 

Hadoop QA commented on HDFS-11359:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
14s{color} | {color:red} Docker failed to build yetus/hadoop:8515d35. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11359 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870910/HDFS-11359-branch-2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19738/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
Attachment: (was: HDFS-11359-branch-2.patch)

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
Attachment: HDFS-11359-branch-2.patch

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Commented] (HDFS-11902) [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034063#comment-16034063
 ] 

Hadoop QA commented on HDFS-11902:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
28s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
17s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 7s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
29s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
43s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} HDFS-9806 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m 
33s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m  3s{color} | {color:orange} root: The patch generated 2 new + 433 unchanged 
- 3 fixed = 435 total (was 436) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 98m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
41s{color} | {color:green} hadoop-fs2img in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork |
|   | hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
|   | hadoop.hdfs.server.namenode.TestDeadDatanode |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | 
hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness
 |
|   | hadoop.hdfs.server.namenode.ha.TestPendingCorruptDnMessages |
|   | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
|   | 

[jira] [Commented] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16034059#comment-16034059
 ] 

Hadoop QA commented on HDFS-11359:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
26s{color} | {color:red} Docker failed to build yetus/hadoop:8515d35. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11359 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870908/HDFS-11359-branch-2.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19737/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the web UI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.






[jira] [Updated] (HDFS-11359) DFSAdmin report command supports displaying maintenance state datanodes

2017-06-01 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-11359:
-
Attachment: HDFS-11359-branch-2.patch

The test failures are not related, and the ASF License warnings are also 
unrelated; I dug into them and filed JIRA HDFS-11899 for a fix.
Attaching the patch for branch-2.

> DFSAdmin report command supports displaying maintenance state datanodes
> ---
>
> Key: HDFS-11359
> URL: https://issues.apache.org/jira/browse/HDFS-11359
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode, namenode
>Affects Versions: 3.0.0-alpha1
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
>Priority: Minor
> Attachments: HDFS-11359.001.patch, HDFS-11359.002.patch, 
> HDFS-11359.003.patch, HDFS-11359.004.patch, HDFS-11359.005.patch, 
> HDFS-11359-branch-2.patch
>
>
> The datanode's maintenance state info can be shown in the webUI/JMX, but it 
> can't be displayed via the CLI. This JIRA will improve on this.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11472) Fix inconsistent replica size after a data pipeline failure

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034052#comment-16034052
 ] 

Hadoop QA commented on HDFS-11472:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 34s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 120 unchanged - 1 fixed = 122 total (was 121) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
28s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}101m  3s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11472 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870896/HDFS-11472.002.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 8fd774ea4781 3.13.0-106-generic #153-Ubuntu SMP Tue Dec 6 
15:44:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 7101477 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19735/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19735/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19735/testReport/ |
| asflicense | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19735/artifact/patchprocess/patch-asflicense-problems.txt
 |
| modules | C: 

[jira] [Commented] (HDFS-11899) ASF License warnings generated intermittently in trunk

2017-06-01 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034049#comment-16034049
 ] 

Yiqun Lin commented on HDFS-11899:
--

Hi [~ajisakaa], I found ASF License warnings being generated intermittently in 
trunk (see this link: 
https://issues.apache.org/jira/browse/HDFS-11359?focusedCommentId=16030563&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16030563).
 Do you have some time to take a quick look at this? It is similar to the 
issue I fixed in HDFS-11795.

> ASF License warnings generated intermittently in trunk
> --
>
> Key: HDFS-11899
> URL: https://issues.apache.org/jira/browse/HDFS-11899
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0-alpha3
>Reporter: Yiqun Lin
>Assignee: Yiqun Lin
> Attachments: HDFS-11899.001.patch
>
>
> Recently ASF License warnings generated intermittently in trunk.
> {noformat}
> Lines that start with ? in the ASF License  report indicate files that do 
> not have an Apache license header:
>  !? /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/include-hosts-file
>  !? /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/exclude-hosts-file
> {noformat}
> The root cause of this is that the include/exclude host files are created in 
> the wrong place by the test {{TestBalancer}}; they are expected to be in the 
> proper test directory. When some unit tests in {{TestBalancer}} time out, 
> these host files cannot be cleaned up and they generate ASF License warnings.
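A minimal sketch of the direction described above, assuming the fix is simply 
to create the host files under the per-test data directory. 
GenericTestUtils.getTestDir exists in Hadoop's test utilities, but the helper 
method below is illustrative, not the actual HDFS-11899 patch:
{code}
// Hypothetical sketch: place the host files under the test's own data
// directory instead of the working directory, so a timed-out run cannot
// leave them at the source-tree root where the ASF License check trips.
static File[] createHostFiles() {
  File testDir = GenericTestUtils.getTestDir("TestBalancer");
  testDir.mkdirs();
  return new File[] {
      new File(testDir, "include-hosts-file"),
      new File(testDir, "exclude-hosts-file")};
}
{code}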



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034021#comment-16034021
 ] 

Andrew Wang commented on HDFS-11907:


Thanks for the reply [~vagarychen]. Just thought I'd mention the other class in 
case there was potential for code sharing.

Could you expand a little on why caching is necessary here? As Kihwal said, df 
is normally pretty cheap, so I'm curious why we need to do this. We could 
possibly also get the same outcome by increasing the monitorHealth check 
interval from 1s to 5s.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---
>
> Key: HDFS-11907
> URL: https://issues.apache.org/jira/browse/HDFS-11907
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, 
> HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}} which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}}, at the frequency of once per 
> second by default. And NameNodeResourceChecker#isResourceAvailable invokes 
> {{df.getAvailable();}} every time it is called.
> Since available space information should rarely change dramatically on a 
> per-second basis, a cached value should be sufficient, i.e. only fetch an 
> updated value when the cached value is too old; otherwise simply return the 
> cached value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.
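A minimal sketch of the time-based caching idea described above; the class and 
field names are illustrative, not from the actual patch:
{code}
import java.io.IOException;
import org.apache.hadoop.fs.DF;
import org.apache.hadoop.util.Time;

class CachedDfAvailable {
  private final DF df;
  private final long refreshIntervalMs;
  private long cachedAvailable;
  private long lastRefreshMs;

  CachedDfAvailable(DF df, long refreshIntervalMs) throws IOException {
    this.df = df;
    this.refreshIntervalMs = refreshIntervalMs;
    refresh();
  }

  /** Serve the cached value; only hit df when the cache is stale. */
  synchronized long getAvailable() throws IOException {
    if (Time.monotonicNow() - lastRefreshMs > refreshIntervalMs) {
      refresh();
    }
    return cachedAvailable;
  }

  private void refresh() throws IOException {
    cachedAvailable = df.getAvailable();
    lastRefreshMs = Time.monotonicNow();
  }
}
{code}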



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034006#comment-16034006
 ] 

Hadoop QA commented on HDFS-11907:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 37s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 5 new + 111 unchanged - 0 fixed = 116 total (was 111) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}123m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork |
|   | hadoop.hdfs.server.namenode.TestNameNodeResourceChecker |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11907 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870883/HDFS-11907.004.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 9229a80e5966 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 
16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 7101477 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19733/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19733/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19733/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19733/console |
| Powered by | 

[jira] [Commented] (HDFS-11779) Ozone: KSM: add listBuckets

2017-06-01 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034012#comment-16034012
 ] 

Weiwei Yang commented on HDFS-11779:


Rebased to the latest code base. [~xyao], [~anu], [~linyiqun], [~nandakumar131], 
[~yuanbo], do you have any more comments on the latest patch? Please let me 
know. Thanks

> Ozone: KSM: add listBuckets
> ---
>
> Key: HDFS-11779
> URL: https://issues.apache.org/jira/browse/HDFS-11779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
> Attachments: HDFS-11779-HDFS-7240.001.patch, 
> HDFS-11779-HDFS-7240.002.patch, HDFS-11779-HDFS-7240.003.patch, 
> HDFS-11779-HDFS-7240.004.patch, HDFS-11779-HDFS-7240.005.patch, 
> HDFS-11779-HDFS-7240.006.patch, HDFS-11779-HDFS-7240.007.patch, 
> HDFS-11779-HDFS-7240.008.patch, HDFS-11779-HDFS-7240.009.patch, 
> HDFS-11779-HDFS-7240.010.patch, HDFS-11779-HDFS-7240.011.patch
>
>
> Lists the buckets of a given volume. Similar to listVolumes, paging is 
> supported via prevKey, prefix and maxKeys.
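As a usage illustration, a caller could page through such a listing as below; 
the {{ksm}} handle and the {{listBuckets}} signature here are assumptions for 
the sketch, not the actual API:
{code}
// Page through all buckets of a volume, resuming from the last key seen.
String prevKey = null;
List<BucketInfo> page;
do {
  page = ksm.listBuckets(volumeName, prevKey, prefix, maxKeys);
  for (BucketInfo bucket : page) {
    process(bucket);
  }
  if (!page.isEmpty()) {
    prevKey = page.get(page.size() - 1).getBucketName(); // resume point
  }
} while (page.size() == maxKeys); // a short page means the listing is done
{code}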



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11779) Ozone: KSM: add listBuckets

2017-06-01 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-11779:
---
Attachment: HDFS-11779-HDFS-7240.011.patch

> Ozone: KSM: add listBuckets
> ---
>
> Key: HDFS-11779
> URL: https://issues.apache.org/jira/browse/HDFS-11779
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
> Attachments: HDFS-11779-HDFS-7240.001.patch, 
> HDFS-11779-HDFS-7240.002.patch, HDFS-11779-HDFS-7240.003.patch, 
> HDFS-11779-HDFS-7240.004.patch, HDFS-11779-HDFS-7240.005.patch, 
> HDFS-11779-HDFS-7240.006.patch, HDFS-11779-HDFS-7240.007.patch, 
> HDFS-11779-HDFS-7240.008.patch, HDFS-11779-HDFS-7240.009.patch, 
> HDFS-11779-HDFS-7240.010.patch, HDFS-11779-HDFS-7240.011.patch
>
>
> Lists the buckets of a given volume. Similar to listVolumes, paging is 
> supported via prevKey, prefix and maxKeys.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11472) Fix inconsistent replica size after a data pipeline failure

2017-06-01 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11472:
---
Attachment: HDFS-11472.002.patch

Updated the patch to address the comment.
Also made a minor change in {{FsDatasetImpl#recoverRbwImpl}} to check whether 
the rbw is a ReplicaBeingWritten, and to skip truncation if it is not.
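A rough sketch of the shape of that check (illustrative only; the surrounding 
method body is elided):
{code}
if (rbw instanceof ReplicaBeingWritten) {
  // truncate the block and checksum to the acknowledged length, as before
} else {
  // skip truncation: only ReplicaBeingWritten tracks bytesAcked, so
  // truncating to the acknowledged length is not meaningful here
}
{code}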

> Fix inconsistent replica size after a data pipeline failure
> ---
>
> Key: HDFS-11472
> URL: https://issues.apache.org/jira/browse/HDFS-11472
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Critical
>  Labels: release-blocker
> Attachments: HDFS-11472.001.patch, HDFS-11472.002.patch, 
> HDFS-11472.testcase.patch
>
>
> We observed a case where a replica's on-disk length is less than its 
> acknowledged length, breaking the assumption in the recovery code.
> {noformat}
> 2017-01-08 01:41:03,532 WARN 
> org.apache.hadoop.hdfs.server.protocol.InterDatanodeProtocol: Failed to 
> obtain replica info for block 
> (=BP-947993742-10.204.0.136-1362248978912:blk_2526438952_1101394519586) from 
> datanode (=DatanodeInfoWithStorage[10.204.138.17:1004,null,null])
> java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: getBytesOnDisk() < 
> getVisibleLength(), rip=ReplicaBeingWritten, blk_2526438952_1101394519586, RBW
>   getNumBytes() = 27530
>   getBytesOnDisk()  = 27006
>   getVisibleLength()= 27268
>   getVolume()   = /data/6/hdfs/datanode/current
>   getBlockFile()= 
> /data/6/hdfs/datanode/current/BP-947993742-10.204.0.136-1362248978912/current/rbw/blk_2526438952
>   bytesAcked=27268
>   bytesOnDisk=27006
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2284)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.initReplicaRecovery(FsDatasetImpl.java:2260)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initReplicaRecovery(DataNode.java:2566)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.callInitReplicaRecovery(DataNode.java:2577)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.recoverBlock(DataNode.java:2645)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.access$400(DataNode.java:245)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataNode$5.run(DataNode.java:2551)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> It turns out that if an exception is thrown within 
> {{BlockReceiver#receivePacket}}, the replica's in-memory on-disk length may 
> not be updated, even though the data has been written to disk.
> For example, here's one exception we observed
> {noformat}
> 2017-01-08 01:40:59,512 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Exception for 
> BP-947993742-10.204.0.136-1362248978912:blk_2526438952_1101394499067
> java.nio.channels.ClosedByInterruptException
> at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.adjustCrcChannelPosition(FsDatasetImpl.java:1484)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.adjustCrcFilePosition(BlockReceiver.java:994)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:670)
> at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:857)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:797)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> There are potentially other places and causes where an exception is thrown 
> within {{BlockReceiver#receivePacket}}, so it may not make much sense to 
> work around this particular exception. Instead, we should improve the 
> replica recovery code to handle the case where the on-disk size is less than 
> the acknowledged size, and update the in-memory checksum accordingly.
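A hedged sketch of that recovery-side handling; the accessors exist on the 
replica classes, but using them this way is an assumption, not the committed 
fix:
{code}
// If the on-disk length fell behind the acknowledged length (e.g. after an
// interrupted receivePacket), trust the on-disk length rather than failing.
long bytesOnDisk = replica.getBytesOnDisk();
if (bytesOnDisk < replica.getVisibleLength()) {
  // shrink the acknowledged length to what is actually on disk; the real
  // fix must also recompute the checksum of the last partial chunk
  replica.setBytesAcked(bytesOnDisk);
}
{code}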



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033968#comment-16033968
 ] 

Arpit Agarwal commented on HDFS-11907:
--

I am +1 on the v4 patch. I will hold off committing, pending Andrew's response.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---
>
> Key: HDFS-11907
> URL: https://issues.apache.org/jira/browse/HDFS-11907
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, 
> HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}} which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}}, at the frequency of once per 
> second by default. And NameNodeResourceChecker#isResourceAvailable invokes 
> {{df.getAvailable();}} every time it is called.
> Since available space information should rarely change dramatically on a 
> per-second basis, a cached value should be sufficient, i.e. only fetch an 
> updated value when the cached value is too old; otherwise simply return the 
> cached value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033960#comment-16033960
 ] 

Chen Liang commented on HDFS-11907:
---

Thanks [~andrew.wang] for the comments! We prefer not to use it here though, 
because:
1. this JIRA is about maintaining the *available space* value, while 
DFCachingGetSpaceUsed gets the *used space*, so we would have to modify that 
class further (or create a new one) if we wanted to use it.
2. it seems that each instance of that class uses an extra background thread 
that periodically updates the value, which looks like a bit of an overkill to me.

But if you do think it is better to use DFCachingGetSpaceUsed, I will try to 
update with another patch.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---
>
> Key: HDFS-11907
> URL: https://issues.apache.org/jira/browse/HDFS-11907
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, 
> HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}} which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}}, at the frequency of once per 
> second by default. And NameNodeResourceChecker#isResourceAvailable invokes 
> {{df.getAvailable();}} every time it is called.
> Since available space information should rarely change dramatically on a 
> per-second basis, a cached value should be sufficient, i.e. only fetch an 
> updated value when the cached value is too old; otherwise simply return the 
> cached value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11902) [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.

2017-06-01 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033938#comment-16033938
 ] 

Virajith Jalaparti commented on HDFS-11902:
---

Thanks for taking a look [~chris.douglas]! Posting a modified patch (v2) which 
changes {{BlockProvider}} from an abstract class to an interface that extends 
{{Iterable}}. This change also removes the {{FileRegionProvider}} class. The 
patch has also been rebased on the latest version of HDFS-9806 branch.
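The resulting shape is roughly as follows (a sketch; the element type 
FileRegion is assumed from the HDFS-9806 branch, and the body is elided):
{code}
// BlockProvider as a plain interface: implementations only need to
// enumerate the FileRegions that describe PROVIDED blocks.
public interface BlockProvider extends Iterable<FileRegion> {
  // Iterable supplies iterator(); no other methods are required here.
}
{code}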

> [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.
> -
>
> Key: HDFS-11902
> URL: https://issues.apache.org/jira/browse/HDFS-11902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11902-HDFS-9806.001.patch, 
> HDFS-11902-HDFS-9806.002.patch
>
>
> Currently {{BlockFormatProvider}} and {{TextFileRegionProvider}} perform 
> almost the same function on the Namenode and Datanode respectively. This JIRA 
> is to merge them into one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11902) [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11902:
--
Status: Patch Available  (was: Open)

> [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.
> -
>
> Key: HDFS-11902
> URL: https://issues.apache.org/jira/browse/HDFS-11902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11902-HDFS-9806.001.patch, 
> HDFS-11902-HDFS-9806.002.patch
>
>
> Currently {{BlockFormatProvider}} and {{TextFileRegionProvider}} perform 
> almost the same function on the Namenode and Datanode respectively. This JIRA 
> is to merge them into one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11902) [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11902:
--
Status: Open  (was: Patch Available)

> [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.
> -
>
> Key: HDFS-11902
> URL: https://issues.apache.org/jira/browse/HDFS-11902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11902-HDFS-9806.001.patch, 
> HDFS-11902-HDFS-9806.002.patch
>
>
> Currently {{BlockFormatProvider}} and {{TextFileRegionProvider}} perform 
> almost the same function on the Namenode and Datanode respectively. This JIRA 
> is to merge them into one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11902) [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11902:
--
Attachment: HDFS-11902-HDFS-9806.002.patch

> [READ] Merge BlockFormatProvider and TextFileRegionProvider into one.
> -
>
> Key: HDFS-11902
> URL: https://issues.apache.org/jira/browse/HDFS-11902
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11902-HDFS-9806.001.patch, 
> HDFS-11902-HDFS-9806.002.patch
>
>
> Currently {{BlockFormatProvider}} and {{TextFileRegionProvider}} perform 
> almost the same function on the Namenode and Datanode respectively. This JIRA 
> is to merge them into one.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033927#comment-16033927
 ] 

Andrew Wang commented on HDFS-11907:


Sorry for coming to this late, but is DFCachingGetSpaceUsed useful here? Seems 
related / similar.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---
>
> Key: HDFS-11907
> URL: https://issues.apache.org/jira/browse/HDFS-11907
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, 
> HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}} which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}}, at the frequency of once per 
> second by default. And NameNodeResourceChecker#isResourceAvailable invokes 
> {{df.getAvailable();}} every time it is called.
> Since available space information should rarely change dramatically on a 
> per-second basis, a cached value should be sufficient, i.e. only fetch an 
> updated value when the cached value is too old; otherwise simply return the 
> cached value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033920#comment-16033920
 ] 

Hadoop QA commented on HDFS-11912:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 35s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 105 new + 0 unchanged - 0 fixed = 105 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
38s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 94m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}125m 13s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestMaintenanceState |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure160 |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11912 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870860/HDFS-11912.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 991750f8746b 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 
16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 219f4c1 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19732/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19732/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19732/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19732/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   

[jira] [Updated] (HDFS-11907) NameNodeResourceChecker should avoid calling df.getAvailable too frequently

2017-06-01 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-11907:
--
Attachment: HDFS-11907.004.patch

Thanks [~arpitagarwal] and [~kihwal] for the comments! Post v004 patch to 
address Arpit's comments.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---
>
> Key: HDFS-11907
> URL: https://issues.apache.org/jira/browse/HDFS-11907
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch, 
> HDFS-11907.003.patch, HDFS-11907.004.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}} which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}}, at the frequency of once per 
> second by default. And NameNodeResourceChecker#isResourceAvailable invokes 
> {{df.getAvailable();}} every time it is called.
> Since available space information should rarely change dramatically on a 
> per-second basis, a cached value should be sufficient, i.e. only fetch an 
> updated value when the cached value is too old; otherwise simply return the 
> cached value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033879#comment-16033879
 ] 

Virajith Jalaparti edited comment on HDFS-11673 at 6/1/17 11:08 PM:


Failed tests and findbugs are unrelated. Committed v5 to the feature branch! 
Thanks [~chris.douglas] for taking a look at the patch.



> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11673:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033879#comment-16033879
 ] 

Virajith Jalaparti commented on HDFS-11673:
---

Committed v5 to the feature branch! Thanks [~chris.douglas] for taking a look 
at the patch.

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11597) Ozone: Add Ratis management API

2017-06-01 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11597:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7240
   Status: Resolved  (was: Patch Available)

[~szetszwo] Thank you for the contribution. I have committed this to the 
feature branch.

> Ozone: Add Ratis management API
> ---
>
> Key: HDFS-11597
> URL: https://issues.apache.org/jira/browse/HDFS-11597
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Fix For: HDFS-7240
>
> Attachments: HDFS-11597-HDFS-7240.20170522.patch, 
> HDFS-11597-HDFS-7240.20170523.patch, HDFS-11597-HDFS-7240.20170524.patch, 
> HDFS-11597-HDFS-7240.20170528b.patch, HDFS-11597-HDFS-7240.20170528.patch, 
> HDFS-11597-HDFS-7240.20170529.patch
>
>
> We need APIs to manage Ratis clusters for the following operations:
> - create cluster;
> - close cluster;
> - get members; and
> - update members.
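A hypothetical shape for such an API; the interface name and method signatures 
below are assumptions for illustration, not the committed code:
{code}
import java.io.IOException;
import java.util.List;

/** Hypothetical management API covering the four operations above. */
public interface RatisClusterManager {
  void createCluster(String clusterId, List<String> members) throws IOException;
  void closeCluster(String clusterId) throws IOException;
  List<String> getMembers(String clusterId) throws IOException;
  void updateMembers(String clusterId, List<String> newMembers) throws IOException;
}
{code}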



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11383) Intern strings in BlockLocation and ExtendedBlock

2017-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033839#comment-16033839
 ] 

Hudson commented on HDFS-11383:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11815 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11815/])
HDFS-11383. Intern strings in BlockLocation and ExtendedBlock. (wang: rev 
7101477e4726a70ab0eab57c2d4480a04446a8dc)
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/StringInterner.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/ExtendedBlock.java
* (edit) 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/BlockLocation.java


> Intern strings in BlockLocation and ExtendedBlock
> -
>
> Key: HDFS-11383
> URL: https://issues.apache.org/jira/browse/HDFS-11383
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11383.01.patch, HDFS-11383.02.patch, 
> HDFS-11383.03.patch, HDFS-11383.04.patch, hs2-crash-2.txt
>
>
> I am working on Hive performance, investigating the problem of high memory 
> pressure when (a) a table consists of a high number (thousands) of partitions 
> and (b) multiple queries run against it concurrently. It turns out that a lot 
> of memory is wasted due to data duplication. One source of duplicate strings 
> is class org.apache.hadoop.fs.BlockLocation. Its fields such as storageIds, 
> topologyPaths, hosts, names, may collectively use up to 6% of memory in my 
> benchmark, causing (together with other problematic classes) a huge memory 
> spike. Of these 6% of memory taken by BlockLocation strings, more than 5% are 
> wasted due to duplication.
> I think we need to add calls to String.intern() in the BlockLocation 
> constructor, like:
> {code}
> this.hosts = internStringsInArray(hosts);
> ...
> private String[] internStringsInArray(String[] sar) {
>   for (int i = 0; i < sar.length; i++) {
>     sar[i] = sar[i].intern();  // dedupe via the JVM string pool
>   }
>   return sar;
> }
> {code}
> String.intern() performs very well starting from JDK 7. I've found some 
> articles explaining the progress that was made by the HotSpot JVM developers 
> in this area, verified that with benchmarks myself, and finally added quite a 
> bit of interning to one of the Cloudera products without any issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11909) Ozone: KSM : Support for simulated file system operations

2017-06-01 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033835#comment-16033835
 ] 

Mingliang Liu commented on HDFS-11909:
--

Thanks [~anu] for the design doc. Nice discussion here. I think 
[~ste...@apache.org] got pinged, so he may chime in as well.

{quote}
I am not sure how useful is that to differentiate file with dir in API, 
getSimulatedFiles and getSimulatedDirectories. In file system they are all 
FileStatus. When we want to implement listStatus, the most convenient API 
wanted would be a call simply returns all paths under a path, along with an 
attribute to indicate itself a dir or a file. Otherwise we will end up with 
calling 2 APIs here, less favored.
{quote}
This makes sense. Alternatively, a single API (maybe listKeys) could return a 
response containing both top-level files and "directories". One motivation to 
separate the files/directories at the API level is to support millions of 
children. In that case the response will be truncated at the client side 
(using {{prev_key}} and {{max_keys}}), and a single response contains only a 
partial result, so we issue paging requests to get all the children of a 
directory. Saving the wasted information therefore seems a good idea if we 
only need files or directories. When I discussed this with Anu offline, one 
specific use case was the non-recursive {{FileSystem::listFiles()}} API, which 
needs top-level files only; that would save the effort of returning 
directories, which are useless there anyway. For the recursive 
{{FileSystem::listFiles()}}, we can simply use {{getSimulatedTree}} on demand 
at the client side. We can think about other use cases that may benefit from 
the separated API. If the single listKeys can work just fine for those cases, 
I'd prefer a single API as well, which is indeed simpler and favored.
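For illustration, a single-API response could carry both kinds of entries plus 
the paging state; this is purely a sketch, not a proposed wire format, and 
KeyInfo is an assumed type:
{code}
// Hypothetical combined listing response for a single listKeys-style API.
class KeyListing {
  List<KeyInfo> entries;  // each entry flags whether it is a file or a "dir"
  boolean truncated;      // true when maxKeys cut the listing short
  String nextPrevKey;     // resume token for the next paged request
}
{code}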

FWIW, currently all the glob and path filtering in HDFS is executed at the 
client side. I think this is _maybe_ because we prefer less load on the 
NameNode over saving network bandwidth in cases where most of the files and 
directories returned are filtered out. In an ideal world, there would be a 
server-side path filter. We can talk about this later if it's too crazy.

{quote}
We certainly could do that, or for the purpose of these APIs we will treat one 
or more / as a single slash; either one will work.
{quote}
Having a strict name convention is not a bad thing to me. It's clear without 
sacrificing important use cases.

> Ozone: KSM :  Support for simulated file system operations
> --
>
> Key: HDFS-11909
> URL: https://issues.apache.org/jira/browse/HDFS-11909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: simulation-file-system.pdf
>
>
> This JIRA adds a proposal that makes it easy to implement OzoneFileSystem. 
> It makes the directory and file list operations simpler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11597) Ozone: Add Ratis management API

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033834#comment-16033834
 ] 

Anu Engineer commented on HDFS-11597:
-

+1, I will commit this shortly.

> Ozone: Add Ratis management API
> ---
>
> Key: HDFS-11597
> URL: https://issues.apache.org/jira/browse/HDFS-11597
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
> Attachments: HDFS-11597-HDFS-7240.20170522.patch, 
> HDFS-11597-HDFS-7240.20170523.patch, HDFS-11597-HDFS-7240.20170524.patch, 
> HDFS-11597-HDFS-7240.20170528b.patch, HDFS-11597-HDFS-7240.20170528.patch, 
> HDFS-11597-HDFS-7240.20170529.patch
>
>
> We need APIs to manage Ratis clusters for the following operations:
> - create cluster;
> - close cluster;
> - get members; and
> - update members.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread John Zhuge (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033830#comment-16033830
 ] 

John Zhuge commented on HDFS-11856:
---

[~vinayrpet] and [~kihwal], will this patch help clusters with more than 3 DNs? 
We saw HBase RegionServers occasionally crashing with DamagedWALException after 
the following pipeline recovery failure:
{noformat}
java.io.IOException: All datanodes 
DatanodeInfoWithStorage[x.x.x.x:20002,DS-uuid,DISK] are bad. Aborting...
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1465)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:1236)
at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:721)
{noformat}

> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait for 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with pipeline 
> recoveries and writes. The restarted nodes will be considered bad and will 
> be excluded for the lifetime of the stream.
> In the case of a small cluster with only 3 nodes in total, each time a 
> remote node restarts for the upgrade, it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one 
> node at the end, with no other nodes left to replace them.
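For reference, the 30-second local-client wait mentioned in point 1 maps to a 
client configuration key; a hedged example (the key is in hdfs-default.xml, 
and the value is in seconds):
{code}
// How long a local client waits for a restarting datanode before giving up.
conf.setLong("dfs.client.datanode-restart.timeout", 30); // seconds
{code}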



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11909) Ozone: KSM : Support for simulated file system operations

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033828#comment-16033828
 ] 

Anu Engineer commented on HDFS-11909:
-

bq. with name "key1/" and another "key1//"

In the web interface we just parse the name into volume, bucket and key -- the 
key part is not really parsed, but rather persisted into LevelDB as a key. I 
have not experimented with this, but I am going to guess that it might work. 
Now for the better part: suppose it does not work, then we don't have this 
problem at all :) So if we parse "key1/" and "key1//" to the same string in 
the web layer, then the user might get an error while the key is being created.
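If the web layer were to normalize instead, a minimal sketch of that 
normalization (an assumption for illustration, not current behavior):
{code}
// Collapse runs of '/' so "key1/" and "key1//" map to the same key name.
static String normalizeKey(String key) {
  return key.replaceAll("/+", "/");
}
{code}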


> Ozone: KSM :  Support for simulated file system operations
> --
>
> Key: HDFS-11909
> URL: https://issues.apache.org/jira/browse/HDFS-11909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: simulation-file-system.pdf
>
>
> This JIRA adds a proposal that makes it easy to implement OzoneFileSystem. 
> It makes the directory and file list operations simpler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16033821#comment-16033821
 ] 

Manoj Govindassamy edited comment on HDFS-11912 at 6/1/17 10:27 PM:


Thanks for contributing this patch [~ghuangups]. Looks good overall. A few 
comments from a quick look; I will add more comments later.

In HDFS-9406, snapshot operations were believed to be causing metadata 
inconsistencies in the fsimage. Can you please try running this new test 
without the fix for HDFS-9406 and see if it can recreate the problem?

1.
{noformat}
if (randomNum > currentWeightSum && randomNum <= (currentWeightSum + 
currentValue.getWeight())) {
  snapshotRandomOp = currentValue;
  break;
}
{noformat}
Shouldn't the check be just {{randomNum < (currentWeightSum + 
currentValue.getWeight())}}? (See the sketch after this list.)

2.
{noformat}
  private static MiniDFSCluster cluster;
  private static DistributedFileSystem hdfs;
  private static Random GENERATOR = null;
{noformat}
Above class members need not be static.

3.
{{FileSystemOperations}} and {{SnapshotOperations}} are very similar except for 
enum values and weights. Code duplication here can be avoided if we can merge 
these two enums into one and expose proper methods.

4.
{noformat}
// Set
Random RANDOM = new Random();
long seed = RANDOM.nextLong();
GENERATOR = new Random(seed);
{noformat}
Any specific reason why a simple seed like System.currentTimeMillis() would 
not be useful here? Here the seed is generated from a Random which is in turn 
not seeded. Also, RANDOM need not be all caps.

5.
{noformat}
int fileLen = new Random().nextInt(MAX_NUM_FILE_LENGTH);
createFiles(testDirString, fileLen);
{noformat}
GENERATOR random can be used here instead of creating a new one.

6.
{noformat}
// Create files in a directory with random depth, ranging from 0-10.
for (int i = 0; i < TOTAL_BLOCKS; i += fileLength) {
{noformat}
Is TOTAL_BLOCKS actually the total number of files?

7.
{noformat}
private String GetNewPathString(String originalString,
{noformat}
Method name should be in camel case, like getNewPathString().
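
For illustration, a minimal sketch of the weighted selection from point 1 with 
the suggested half-open bound; the {{Op}} enum and its weights are 
hypothetical stand-ins for the test's operation enums:

{code}
import java.util.Random;

enum Op {
  CREATE(40), DELETE(20), RENAME(20), SNAPSHOT(20);

  private final int weight;

  Op(int weight) {
    this.weight = weight;
  }

  int getWeight() {
    return weight;
  }

  /** Pick an op with probability proportional to its weight. */
  static Op pick(Random generator) {
    int total = 0;
    for (Op op : values()) {
      total += op.getWeight();
    }
    // randomNum is uniform in [0, total); each op owns the half-open range
    // [currentWeightSum, currentWeightSum + weight), so a single '<' test is
    // enough and no boundary value is counted twice.
    int randomNum = generator.nextInt(total);
    int currentWeightSum = 0;
    for (Op op : values()) {
      if (randomNum < currentWeightSum + op.getWeight()) {
        return op;
      }
      currentWeightSum += op.getWeight();
    }
    throw new AssertionError("weights changed during iteration");
  }
}
{code}

A logged seed, e.g. {{long seed = System.currentTimeMillis();}} passed to 
{{new Random(seed)}}, also keeps failing runs reproducible (point 4).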




was (Author: manojg):
Thanks for contributing this patch [~ghuangups]. A few comments from a quick 
look; will add more comments later.

In HDFS-9406, snapshot operations were believed to be causing metadata 
inconsistencies in the fsimage. Can you please try running this new test 
without the fix for HDFS-9406 and see if it can recreate the problem? 

1.
{noformat}
if (randomNum > currentWeightSum && randomNum <= (currentWeightSum + 
currentValue.getWeight())) {
  snapshotRandomOp = currentValue;
  break;
}
{noformat}
Shouldn't the check be just (randomNum < currentWeightSum + 
currentValue.getWeight())?

2.
{noformat}
  private static MiniDFSCluster cluster;
  private static DistributedFileSystem hdfs;
  private static Random GENERATOR = null;
{noformat}
Above class members need not be static.

3.
{{FileSystemOperations}} and {{SnapshotOperations}} are very similar except for 
enum values and weights. Code duplication here can be avoided if we can merge 
these two enums into one and expose proper methods.

4.
{noformat}
// Set
Random RANDOM = new Random();
long seed = RANDOM.nextLong();
GENERATOR = new Random(seed);
{noformat}
Any specific reason why a simple seed like System.currentTimeMillis() would 
not be useful here? Here the seed is generated from a Random which is in turn 
not seeded. Also, RANDOM need not be all caps.

5.
{noformat}
int fileLen = new Random().nextInt(MAX_NUM_FILE_LENGTH);
createFiles(testDirString, fileLen);
{noformat}
GENERATOR random can be used here instead of creating a new one.

6.
{noformat}
// Create files in a directory with random depth, ranging from 0-10.
for (int i = 0; i < TOTAL_BLOCKS; i += fileLength) {
{noformat}
Is TOTAL_BLOCKS actually the total number of files?

7.
{noformat}
private String GetNewPathString(String originalString,
{noformat}
Method name should be in camel case, like getNewPathString().



> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Reporter: George Huang
>Priority: Minor
> Attachments: HDFS-11912.001.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033821#comment-16033821
 ] 

Manoj Govindassamy commented on HDFS-11912:
---

Thanks for contributing this patch [~ghuangups]. A few comments from a quick 
look; will add more comments later.

In HDFS-9406, snapshot operations were believed to be causing metadata 
inconsistencies in the fsimage. Can you please try running this new test 
without the fix for HDFS-9406 and see if it can recreate the problem? 

1.
{noformat}
if (randomNum > currentWeightSum && randomNum <= (currentWeightSum + 
currentValue.getWeight())) {
  snapshotRandomOp = currentValue;
  break;
}
{noformat}
Shouldn't the check be just (randomNum < currentWeightSum + 
currentValue.getWeight())?

2.
{noformat}
  private static MiniDFSCluster cluster;
  private static DistributedFileSystem hdfs;
  private static Random GENERATOR = null;
{noformat}
Above class members need not be static.

3.
{{FileSystemOperations}} and {{SnapshotOperations}} are very similar except for 
enum values and weights. Code duplication here can be avoided if we can merge 
these two enums into one and expose proper methods.

4.
{noformat}
// Set
Random RANDOM = new Random();
long seed = RANDOM.nextLong();
GENERATOR = new Random(seed);
{noformat}
Any specific reason why a simple seed like System.currentTimeMillis() would 
not be useful here? Here the seed is generated from a Random which is in turn 
not seeded. Also, RANDOM need not be all caps.

5.
{noformat}
int fileLen = new Random().nextInt(MAX_NUM_FILE_LENGTH);
createFiles(testDirString, fileLen);
{noformat}
GENERATOR random can be used here instead of creating a new one.

6.
{noformat}
// Create files in a directory with random depth, ranging from 0-10.
for (int i = 0; i < TOTAL_BLOCKS; i += fileLength) {
{noformat}
Is TOTAL_BLOCKS actually the total number of files?

7.
{noformat}
private String GetNewPathString(String originalString,
{noformat}
Method name should be in camel case, like getNewPathString().



> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Reporter: George Huang
>Priority: Minor
> Attachments: HDFS-11912.001.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11909) Ozone: KSM : Support for simulated file system operations

2017-06-01 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033818#comment-16033818
 ] 

Weiwei Yang commented on HDFS-11909:


Hi [~yuanbo] and [~anu], we certainly can handle the extra slashes in the file 
system APIs, but how can we handle this on the ozone web handler side, if a 
user calls the REST API to create a key with name "key1/" and another with 
"key1//"?

> Ozone: KSM :  Support for simulated file system operations
> --
>
> Key: HDFS-11909
> URL: https://issues.apache.org/jira/browse/HDFS-11909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: simulation-file-system.pdf
>
>
> This JIRA adds a proposal that makes it easy to implement OzoneFileSystem. 
> This makes the directory and file list operations simpler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11383) Intern strings in BlockLocation and ExtendedBlock

2017-06-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11383:
---
Summary: Intern strings in BlockLocation and ExtendedBlock  (was: String 
duplication in org.apache.hadoop.fs.BlockLocation)

> Intern strings in BlockLocation and ExtendedBlock
> -
>
> Key: HDFS-11383
> URL: https://issues.apache.org/jira/browse/HDFS-11383
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11383.01.patch, HDFS-11383.02.patch, 
> HDFS-11383.03.patch, HDFS-11383.04.patch, hs2-crash-2.txt
>
>
> I am working on Hive performance, investigating the problem of high memory 
> pressure when (a) a table consists of a high number (thousands) of partitions 
> and (b) multiple queries run against it concurrently. It turns out that a lot 
> of memory is wasted due to data duplication. One source of duplicate strings 
> is class org.apache.hadoop.fs.BlockLocation. Its fields such as storageIds, 
> topologyPaths, hosts, names, may collectively use up to 6% of memory in my 
> benchmark, causing (together with other problematic classes) a huge memory 
> spike. Of these 6% of memory taken by BlockLocation strings, more than 5% are 
> wasted due to duplication.
> I think we need to add calls to String.intern() in the BlockLocation 
> constructor, like:
> {code}
> this.hosts = internStringsInArray(hosts);
> ...
> private String[] internStringsInArray(String[] sar) {
>   for (int i = 0; i < sar.length; i++) {
>     sar[i] = sar[i].intern();
>   }
>   return sar;
> }
> {code}
> String.intern() performs very well starting from JDK 7. I've found some 
> articles explaining the progress that was made by the HotSpot JVM developers 
> in this area, verified that with benchmarks myself, and finally added quite a 
> bit of interning to one of the Cloudera products without any issues.
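
For illustration, a small self-contained demo of the deduplication effect 
described above; it shows the intern() behavior, not the HDFS patch itself:

{code}
public class InternDemo {
  public static void main(String[] args) {
    String a = new String("datanode-01.example.com");
    String b = new String("datanode-01.example.com");
    // Two distinct heap copies of the same characters.
    System.out.println(a == b);                   // false
    // intern() returns one canonical instance for equal strings.
    System.out.println(a.intern() == b.intern()); // true
  }
}
{code}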



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11383) Intern strings in BlockLocation and ExtendedBlock

2017-06-01 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11383:
---
   Resolution: Fixed
Fix Version/s: 3.0.0-alpha4
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2, thanks again Misha for working on this!

> Intern strings in BlockLocation and ExtendedBlock
> -
>
> Key: HDFS-11383
> URL: https://issues.apache.org/jira/browse/HDFS-11383
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11383.01.patch, HDFS-11383.02.patch, 
> HDFS-11383.03.patch, HDFS-11383.04.patch, hs2-crash-2.txt
>
>
> I am working on Hive performance, investigating the problem of high memory 
> pressure when (a) a table consists of a high number (thousands) of partitions 
> and (b) multiple queries run against it concurrently. It turns out that a lot 
> of memory is wasted due to data duplication. One source of duplicate strings 
> is class org.apache.hadoop.fs.BlockLocation. Its fields such as storageIds, 
> topologyPaths, hosts, names, may collectively use up to 6% of memory in my 
> benchmark, causing (together with other problematic classes) a huge memory 
> spike. Of these 6% of memory taken by BlockLocation strings, more than 5% are 
> wasted due to duplication.
> I think we need to add calls to String.intern() in the BlockLocation 
> constructor, like:
> {code}
> this.hosts = internStringsInArray(hosts);
> ...
> private String[] internStringsInArray(String[] sar) {
>   for (int i = 0; i < sar.length; i++) {
>     sar[i] = sar[i].intern();
>   }
>   return sar;
> }
> {code}
> String.intern() performs very well starting from JDK 7. I've found some 
> articles explaining the progress that was made by the HotSpot JVM developers 
> in this area, verified that with benchmarks myself, and finally added quite a 
> bit of interning to one of the Cloudera products without any issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11894) Ozone: Cleanup imports

2017-06-01 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033809#comment-16033809
 ] 

Weiwei Yang commented on HDFS-11894:


Thanks [~anu] for manually triggering the Jenkins job and committing the patch.

> Ozone: Cleanup imports
> --
>
> Key: HDFS-11894
> URL: https://issues.apache.org/jira/browse/HDFS-11894
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
>Priority: Trivial
> Fix For: HDFS-7240
>
> Attachments: HDFS-11894-HDFS-7240.001.patch
>
>
> As discussed in HDFS-11846, We have some imports like Nullable and 
> NonNullable that might be imported from packages which we don't intend to 
> take dependencies on. This JIRA tracks cleaning up those.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11383) String duplication in org.apache.hadoop.fs.BlockLocation

2017-06-01 Thread Andrew Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033807#comment-16033807
 ] 

Andrew Wang commented on HDFS-11383:


LGTM thanks Misha, will commit shortly.

> String duplication in org.apache.hadoop.fs.BlockLocation
> 
>
> Key: HDFS-11383
> URL: https://issues.apache.org/jira/browse/HDFS-11383
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HDFS-11383.01.patch, HDFS-11383.02.patch, 
> HDFS-11383.03.patch, HDFS-11383.04.patch, hs2-crash-2.txt
>
>
> I am working on Hive performance, investigating the problem of high memory 
> pressure when (a) a table consists of a high number (thousands) of partitions 
> and (b) multiple queries run against it concurrently. It turns out that a lot 
> of memory is wasted due to data duplication. One source of duplicate strings 
> is class org.apache.hadoop.fs.BlockLocation. Its fields such as storageIds, 
> topologyPaths, hosts, names, may collectively use up to 6% of memory in my 
> benchmark, causing (together with other problematic classes) a huge memory 
> spike. Of these 6% of memory taken by BlockLocation strings, more than 5% are 
> wasted due to duplication.
> I think we need to add calls to String.intern() in the BlockLocation 
> constructor, like:
> {code}
> this.hosts = internStringsInArray(hosts);
> ...
> private String[] internStringsInArray(String[] sar) {
>   for (int i = 0; i < sar.length; i++) {
>     sar[i] = sar[i].intern();
>   }
>   return sar;
> }
> {code}
> String.intern() performs very well starting from JDK 7. I've found some 
> articles explaining the progress that was made by the HotSpot JVM developers 
> in this area, verified that with benchmarks myself, and finally added quite a 
> bit of interning to one of the Cloudera products without any issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11789) Maintain Short-Circuit Read Statistics

2017-06-01 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-11789:
-
Target Version/s: 2.9.0

> Maintain Short-Circuit Read Statistics
> --
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch
>
>
> If a disk or controller hardware is faulty then short-circuit read requests 
> can stall indefinitely while reading from the file descriptor. Currently 
> there is no way to detect when short-circuit read requests are slow or 
> blocked. 
> This Jira proposes that each BlockReaderLocal maintain read statistics while 
> it is active by measuring the time taken for a pre-determined fraction of 
> read requests. These per-reader stats can be aggregated into global stats 
> when the reader is closed. The aggregate statistics can be exposed via JMX.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11789) Maintain Short-Circuit Read Statistics

2017-06-01 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-11789:
-
Component/s: hdfs-client

> Maintain Short-Circuit Read Statistics
> --
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch
>
>
> If a disk or controller hardware is faulty then short-circuit read requests 
> can stall indefinitely while reading from the file descriptor. Currently 
> there is no way to detect when short-circuit read requests are slow or 
> blocked. 
> This Jira proposes that each BlockReaderLocal maintain read statistics while 
> it is active by measuring the time taken for a pre-determined fraction of 
> read requests. These per-reader stats can be aggregated into global stats 
> when the reader is closed. The aggregate statistics can be exposed via JMX.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033803#comment-16033803
 ] 

Wei-Chiu Chuang commented on HDFS-11741:


Thanks [~xiaochen] and [~yzhangal] for pushing the patch to the finish line.

> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, 
> because KeyManager returns an expired DataEncryptionKey, and it throws the 
> following exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does 
> not update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running 
> balancer failed after 20~30 hours.
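
For illustration, a minimal sketch of the fix's idea: cache the 
DataEncryptionKey but regenerate it once it expires, rather than handing out 
the stale key that triggers the SASL failure. {{EncryptionKeyCache}} and its 
types are stand-ins for the balancer's KeyManager internals, not the committed 
patch:

{code}
final class EncryptionKeyCache {

  interface KeyGenerator {
    CachedKey generate();
  }

  static final class CachedKey {
    final long expiryDate; // milliseconds since the epoch

    CachedKey(long expiryDate) {
      this.expiryDate = expiryDate;
    }
  }

  private final KeyGenerator generator;
  private CachedKey key;

  EncryptionKeyCache(KeyGenerator generator) {
    this.generator = generator;
  }

  synchronized CachedKey get() {
    // Regenerate instead of returning a stale key once the expiry passes.
    if (key == null || key.expiryDate <= System.currentTimeMillis()) {
      key = generator.generate();
    }
    return key;
  }
}
{code}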



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11789) Maintain Short-Circuit Read Statistics

2017-06-01 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033796#comment-16033796
 ] 

Arpit Agarwal commented on HDFS-11789:
--

Thanks for contributing this improvement [~hanishakoneru]. A few comments:

# For maintaining stats one more option is RollingAverages which uses 
MutableRatesWithAggregation and is optimized for multithreaded updates.
# Instead of floating-point arithmetic here:
{code}
sampleRangeMax = (int) ((double) conf.getScrMetricsSamplingPercentage()
/ 100 * Integer.MAX_VALUE);
{code}
Alternatively:
{code}
sampleRangeMax = (Integer.MAX_VALUE / 100) * 
conf.getScrMetricsSamplingPercentage();
{code}
Also we should clamp getScrMetricsSamplingPercentage to \[0, 100\] in case 
the administrator misconfigures it; see the sketch after this list.
# We should add isolated test cases for BlockReaderIoProvider and 
BlockReaderLocalMetrics if possible.
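
For illustration, a sketch combining the integer arithmetic and the 
\[0, 100\] clamping from item 2; the method is a hypothetical helper, not 
code from the patch:

{code}
static int computeSampleRangeMax(int samplingPercentage) {
  // Guard against misconfiguration before deriving the threshold.
  int clamped = Math.max(0, Math.min(100, samplingPercentage));
  // Integer arithmetic only: avoids the double round trip, at the cost of
  // the small truncation in Integer.MAX_VALUE / 100, which is fine for a
  // sampling threshold.
  return (Integer.MAX_VALUE / 100) * clamped;
}
{code}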

> Maintain Short-Circuit Read Statistics
> --
>
> Key: HDFS-11789
> URL: https://issues.apache.org/jira/browse/HDFS-11789
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
> Attachments: HDFS-11789.001.patch
>
>
> If a disk or controller hardware is faulty then short-circuit read requests 
> can stall indefinitely while reading from the file descriptor. Currently 
> there is no way to detect when short-circuit read requests are slow or 
> blocked. 
> This Jira proposes that each BlockReaderLocal maintain read statistics while 
> it is active by measuring the time taken for a pre-determined fraction of 
> read requests. These per-reader stats can be aggregated into global stats 
> when the reader is closed. The aggregate statistics can be exposed via JMX.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11904) Reuse iip in unprotectedRemoveXAttrs calls

2017-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033792#comment-16033792
 ] 

Hudson commented on HDFS-11904:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11814 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11814/])
HDFS-11904. Reuse iip in unprotectedRemoveXAttrs calls. (xiao: rev 
219f4c199e45f8ce7f41192493bf0dc8f1e5dc30)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirErasureCodingOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirXAttrOp.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java


> Reuse iip in unprotectedRemoveXAttrs calls
> --
>
> Key: HDFS-11904
> URL: https://issues.apache.org/jira/browse/HDFS-11904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11904.01.patch
>
>
> In HDFS-10939, {{unprotectedSetXAttrs}} was optimized to use IIP instead of 
> path string.
> This jira is to do the same on {{unprotectedRemoveXAttrs}}.
> No performance test specifically for this has been done yet, but it's not 
> hard to see that a usage pattern of frequent removexattr calls could induce 
> perf issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-01 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10816:
---
Status: Patch Available  (was: Open)

> TestComputeInvalidateWork#testDatanodeReRegistration fails due to race 
> between test and replication monitor
> ---
>
> Key: HDFS-10816
> URL: https://issues.apache.org/jira/browse/HDFS-10816
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10816.001.patch, HDFS-10816.002.patch, 
> HDFS-10816-branch-2.002.patch
>
>
> {noformat}
> java.lang.AssertionError: Expected invalidate blocks to be the number of DNs 
> expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork.testDatanodeReRegistration(TestComputeInvalidateWork.java:160)
> {noformat}
> The test fails because of a race condition between the test and the 
> replication monitor. The default replication monitor interval is 3 seconds, 
> which is just about how long the test normally takes to run. The test 
> deletes a file and then gets the namesystem write lock. However, if the 
> replication monitor fires in between those two instructions, the test will 
> fail as it will itself invalidate one of the blocks. This can be easily 
> reproduced by removing the sleep() in the ReplicationMonitor's run() method 
> in BlockManager.java, so that the replication monitor executes as quickly as 
> possible and exacerbates the race. 
> To fix the test all that needs to be done is to turn off the replication 
> monitor. 
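
For illustration, a sketch of the proposed fix, assuming the standard 
replication-interval configuration key: pushing the interval far beyond the 
test's runtime keeps the replication monitor from firing mid-test.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class TestSetupSketch {
  static Configuration confWithQuietReplicationMonitor() {
    Configuration conf = new HdfsConfiguration();
    // The interval is in seconds; Integer.MAX_VALUE is roughly 68 years, so
    // the monitor effectively never runs during the test.
    conf.setInt("dfs.namenode.replication.interval", Integer.MAX_VALUE);
    return conf;
  }
}
{code}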



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10816) TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

2017-06-01 Thread Eric Badger (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10816:
---
Status: Open  (was: Patch Available)

> TestComputeInvalidateWork#testDatanodeReRegistration fails due to race 
> between test and replication monitor
> ---
>
> Key: HDFS-10816
> URL: https://issues.apache.org/jira/browse/HDFS-10816
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10816.001.patch, HDFS-10816.002.patch, 
> HDFS-10816-branch-2.002.patch
>
>
> {noformat}
> java.lang.AssertionError: Expected invalidate blocks to be the number of DNs 
> expected:<3> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork.testDatanodeReRegistration(TestComputeInvalidateWork.java:160)
> {noformat}
> The test fails because of a race condition between the test and the 
> replication monitor. The default replication monitor interval is 3 seconds, 
> which is just about how long the test normally takes to run. The test 
> deletes a file and then gets the namesystem write lock. However, if the 
> replication monitor fires in between those two instructions, the test will 
> fail as it will itself invalidate one of the blocks. This can be easily 
> reproduced by removing the sleep() in the ReplicationMonitor's run() method 
> in BlockManager.java, so that the replication monitor executes as quickly as 
> possible and exacerbates the race. 
> To fix the test all that needs to be done is to turn off the replication 
> monitor. 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033774#comment-16033774
 ] 

Hadoop QA commented on HDFS-11673:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
35s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 13m 
31s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m 
55s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
16s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
37s{color} | {color:green} HDFS-9806 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
51s{color} | {color:green} HDFS-9806 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-tools/hadoop-fs2img in HDFS-9806 has 1 extant 
Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} HDFS-9806 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 13m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m  6s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
44s{color} | {color:green} hadoop-fs2img in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m  5s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.TestDFSRSDefault10x4StripedOutputStreamWithFailure |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11673 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870838/HDFS-11673-HDFS-9806.005.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0e09caad67cd 3.13.0-108-generic #155-Ubuntu SMP Wed Jan 11 
16:58:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-9806 / fc467d6 |
| Default Java | 1.8.0_131 |
| findbugs | v3.1.0-RC1 |
| findbugs | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19731/artifact/patchprocess/branch-findbugs-hadoop-tools_hadoop-fs2img-warnings.html
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19731/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 

[jira] [Updated] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-11856:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.7.4
   Status: Resolved  (was: Patch Available)

> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with pipeline 
> recoveries and writes. These restarted nodes will be considered bad, and 
> will be excluded for the lifetime of the stream.
> In a small cluster, where the total node count itself is 3, each time a 
> remote node restarts for an upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one 
> node at the end, as there are no other nodes left to replace the excluded 
> ones.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11856) Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline updates

2017-06-01 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033758#comment-16033758
 ] 

Kihwal Lee commented on HDFS-11856:
---

+1 for the branch-2.7 patch. I've just committed it.

> Ability to re-add Upgrading Nodes (remote) to pipeline for future pipeline 
> updates
> --
>
> Key: HDFS-11856
> URL: https://issues.apache.org/jira/browse/HDFS-11856
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, rolling upgrades
>Affects Versions: 2.7.3
>Reporter: Vinayakumar B
>Assignee: Vinayakumar B
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-11856-01.patch, HDFS-11856-02.branch-2.patch, 
> HDFS-11856-02.patch, HDFS-11856-branch-2-02.patch, 
> HDFS-11856-branch-2.7-02.patch, HDFS-11856-branch-2.8-02.patch
>
>
> During a rolling upgrade, if the DN gets restarted it will send a special 
> OOB_RESTART status to all streams opened for write.
> 1. Local clients will wait 30 seconds for the datanode to come back.
> 2. Remote clients will consider these nodes bad and continue with pipeline 
> recoveries and writes. These restarted nodes will be considered bad, and 
> will be excluded for the lifetime of the stream.
> In a small cluster, where the total node count itself is 3, each time a 
> remote node restarts for an upgrade it will be excluded.
> So a stream initially writing to 3 nodes will end up writing to only one 
> node at the end, as there are no other nodes left to replace the excluded 
> ones.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033737#comment-16033737
 ] 

Hudson commented on HDFS-11741:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11813 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11813/])
HDFS-11741. Long running balancer may fail due to expired (xiao: rev 
6a3fc685a98718742c351ed6625dc7a4dee55e77)
* (add) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestKeyManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/security/token/block/BlockTokenSecretManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/KeyManager.java


> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, 
> because KeyManager returns an expired DataEncryptionKey, and it throws the 
> following exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does 
> not update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running 
> balancer failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread George Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Huang updated HDFS-11912:

Status: Patch Available  (was: Open)

> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Reporter: George Huang
>Priority: Minor
> Attachments: HDFS-11912.001.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread George Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

George Huang updated HDFS-11912:

Attachment: HDFS-11912.001.patch

> Add a snapshot unit test with randomized file IO operations
> ---
>
> Key: HDFS-11912
> URL: https://issues.apache.org/jira/browse/HDFS-11912
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs
>Reporter: George Huang
>Priority: Minor
> Attachments: HDFS-11912.001.patch
>
>
> Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11912) Add a snapshot unit test with randomized file IO operations

2017-06-01 Thread George Huang (JIRA)
George Huang created HDFS-11912:
---

 Summary: Add a snapshot unit test with randomized file IO 
operations
 Key: HDFS-11912
 URL: https://issues.apache.org/jira/browse/HDFS-11912
 Project: Hadoop HDFS
  Issue Type: Test
  Components: hdfs
Reporter: George Huang
Priority: Minor


Adding a snapshot unit test with randomized file IO operations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11904) Reuse iip in unprotectedRemoveXAttrs calls

2017-06-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-11904:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha4
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks again, [~liuml07].

> Reuse iip in unprotectedRemoveXAttrs calls
> --
>
> Key: HDFS-11904
> URL: https://issues.apache.org/jira/browse/HDFS-11904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Fix For: 2.9.0, 3.0.0-alpha4
>
> Attachments: HDFS-11904.01.patch
>
>
> In HDFS-10939, {{unprotectedSetXAttrs}} was optimized to use IIP instead of 
> path string.
> This jira is to do the same on {{unprotectedRemoveXAttrs}}.
> No performance test specifically for this has been done yet, but it's not 
> hard to see that a usage pattern of frequent removexattr calls could induce 
> perf issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11904) Reuse iip in unprotectedRemoveXAttrs calls

2017-06-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-11904:
-
Target Version/s: 2.9.0  (was: 2.8.2)

> Reuse iip in unprotectedRemoveXAttrs calls
> --
>
> Key: HDFS-11904
> URL: https://issues.apache.org/jira/browse/HDFS-11904
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 2.7.0
>Reporter: Xiao Chen
>Assignee: Xiao Chen
> Attachments: HDFS-11904.01.patch
>
>
> In HDFS-10939, {{unprotectedSetXAttrs}} was optimized to use IIP instead of 
> path string.
> This jira is to do the same on {{unprotectedRemoveXAttrs}}.
> No performance test specifically for this has been done yet, but it's not 
> hard to see that a usage pattern of frequent removexattr calls could induce 
> perf issues.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-11741:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.2
   3.0.0-alpha4
   2.9.0
   Status: Resolved  (was: Patch Available)

> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, 
> because KeyManager returns an expired DataEncryptionKey, and it throws the 
> following exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does 
> not update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running 
> balancer failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033709#comment-16033709
 ] 

Xiao Chen commented on HDFS-11741:
--

Compiled and ran {{TestBlockToken}} & {{TestKeyManager}} locally on branch-2, 
passed.
Ran the failed tests reported by pre-commit on trunk, passed.

Committed this to trunk, branch-2, branch-2.8. Thanks [~jojochuang] for 
reporting and fixing the issue, and [~andrew.wang] [~yzhangal] for reviews!

> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, 
> because KeyManager returns an expired DataEncryptionKey, and it throws the 
> following exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does 
> not update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running 
> balancer failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11771) Ozone: KSM: Add checkVolumeAccess

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033672#comment-16033672
 ] 

Anu Engineer commented on HDFS-11771:
-

{code}
String userName;
if (acl.getName() == null) {
  if (acl.getType() != OzoneAcl.OzoneACLType.WORLD) {
throw new IllegalArgumentException(
"Only world ACL Type can have null username");
  }
  userName = "world";
} else {
  userName = acl.getName();
}
{code}

I still feel this is wrong: if I have a real user called *world*, then this 
will conflict with it. I think the right fix is to remove this check, and if 
the user name is currently required because of protobuf, then change the 
protobuf definition to make the user name optional.
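
For illustration, a sketch of the alternative without the magic user name; 
{{OzoneAclSketch}} and its fields are assumptions for the sketch, not the 
actual Ozone classes:

{code}
final class OzoneAclSketch {

  enum Type { USER, GROUP, WORLD }

  private final Type type;
  private final String name; // null if and only if type == WORLD

  OzoneAclSketch(Type type, String name) {
    if ((name == null) != (type == Type.WORLD)) {
      throw new IllegalArgumentException(
          "Only the WORLD ACL type may (and must) have a null name");
    }
    this.type = type;
    this.name = name;
  }

  boolean matchesUser(String user) {
    // A real user literally named "world" no longer collides: WORLD ACLs
    // match everyone, USER ACLs match by exact name.
    return type == Type.WORLD || (type == Type.USER && user.equals(name));
  }
}
{code}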


> Ozone: KSM:  Add checkVolumeAccess
> --
>
> Key: HDFS-11771
> URL: https://issues.apache.org/jira/browse/HDFS-11771
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11771-HDFS-7240.001.patch, 
> HDFS-11771-HDFS-7240.002.patch, HDFS-11771-HDFS-7240.003.patch, 
> HDFS-11771-HDFS-7240.004.patch, HDFS-11771-HDFS-7240.005.patch, 
> HDFS-11771-HDFS-7240.006.patch
>
>
> Checks if the caller has access to a given volume. This call supports the 
> ACLs specified in the ozone rest protocol documentation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11804) KMS client needs retry logic

2017-06-01 Thread Rushabh S Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033667#comment-16033667
 ] 

Rushabh S Shah commented on HDFS-11804:
---

bq. Correct. So we should treat AuthenticationException similar as 
AccessControlException and AuthorizationException, right?
Somehow I mixed up AuthenticationException with AuthorizationException, my bad.
Yes, I will incorporate the change in the next patch.
bq. Say for example we have 3 KMS's. I think we should retry immediately for 
all 3 at first.
Makes sense..The only issue I can see is the first time we calculate 
{{FailoverOnNetworkExceptionRetry.getFailoverOrRetrySleepTime(int times)}} will 
have {{times == #kms servers}} instead of gradually increasing the sleep time. 
But maybe its ok.
Will attach a new patch shortly incorporating all the comments.
Thanks again [~xiaochen] !
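
For illustration, a minimal backoff sketch of the behavior being discussed, not 
the Hadoop {{RetryPolicies}} implementation, and with all names below 
hypothetical: retry every KMS immediately on the first pass, then sleep with a 
capped exponential delay on later passes.

{code}
// Hedged sketch: immediate first pass over all servers, backoff afterwards.
class KmsRetrySketch {
  static long sleepMillis(int attempt, int numServers, long baseDelayMs) {
    if (attempt < numServers) {
      return 0; // first pass: try every KMS instance immediately
    }
    int pass = attempt / numServers; // completed passes over the server list
    return baseDelayMs * (1L << Math.min(pass - 1, 16)); // capped exponential
  }

  public static void main(String[] args) {
    for (int attempt = 0; attempt < 9; attempt++) {
      System.out.println("attempt " + attempt + " -> sleep "
          + sleepMillis(attempt, 3, 1000) + " ms");
    }
  }
}
{code}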

> KMS client needs retry logic
> 
>
> Key: HDFS-11804
> URL: https://issues.apache.org/jira/browse/HDFS-11804
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-11804-trunk-1.patch, HDFS-11804-trunk.patch
>
>
> The KMS client appears to have no retry logic – at all.  It's completely 
> decoupled from the IPC retry logic.  This has major impacts if the KMS is 
> unreachable for any reason, including but not limited to network connection 
> issues, timeouts, or the +restart during an upgrade+.
> This has some major ramifications:
> # Jobs may fail to submit, although the oozie resubmit logic should mask it
> # Non-oozie launchers may experience higher failure rates if they do not 
> already have retry logic
> # Tasks reading EZ files will fail, though this will probably be masked by 
> framework reattempts
> # EZ file creation fails after creating a 0-length file – the client receives 
> the EDEK in the create response, then fails when decrypting it
> # Bulk hadoop fs copies, and maybe distcp, will fail prematurely



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11894) Ozone: Cleanup imports

2017-06-01 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-11894:

  Resolution: Fixed
Hadoop Flags: Reviewed
   Fix Version/s: HDFS-7240
Target Version/s: HDFS-7240
  Status: Resolved  (was: Patch Available)

[~cheersyang] Thanks for the contribution. I have fixed the deprecation warning 
while committing to the feature branch. 

> Ozone: Cleanup imports
> --
>
> Key: HDFS-11894
> URL: https://issues.apache.org/jira/browse/HDFS-11894
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
>Priority: Trivial
> Fix For: HDFS-7240
>
> Attachments: HDFS-11894-HDFS-7240.001.patch
>
>
> As discussed in HDFS-11846, we have some imports like Nullable and 
> NonNullable that might be imported from packages we don't intend to take 
> dependencies on. This JIRA tracks cleaning those up.
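
As a purely illustrative example of the kind of swap involved (the package 
names here are assumptions, not taken from the patch):

{code}
// Before: annotation pulled in from a package we don't want a dependency on
// (hypothetical example of an IDE auto-import):
// import org.jetbrains.annotations.Nullable;

// After: the JSR-305 annotation already on Hadoop's classpath.
import javax.annotation.Nullable;

class ImportCleanupExample {
  @Nullable
  String maybeName; // annotation semantics unchanged; only the dependency moves
}
{code}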



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-11873) Ozone: Object store handler cannot serve multiple requests from single http client

2017-06-01 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao reassigned HDFS-11873:
-

Assignee: Xiaoyu Yao  (was: Anu Engineer)

> Ozone: Object store handler cannot serve multiple requests from single http 
> client
> --
>
> Key: HDFS-11873
> URL: https://issues.apache.org/jira/browse/HDFS-11873
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: HDFS-7240
>Reporter: Weiwei Yang
>Assignee: Xiaoyu Yao
>Priority: Critical
> Attachments: HDFS-11873-HDFS-7240.testcase.patch
>
>
> This issue was found when I worked on HDFS-11846. Instead of creating a new 
> http client instance per request, I tried to reuse the {{CloseableHttpClient}} 
> in the {{OzoneClient}} class via a {{PoolingHttpClientConnectionManager}}. 
> However, every second request from the http client hangs and never gets 
> dispatched to {{ObjectStoreJerseyContainer}}. There seems to be something 
> wrong in the netty pipeline. This jira aims to 1) fix the problem on the 
> server side, and 2) pool the client http clients to reduce the resource 
> overhead.
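
For reference, a minimal sketch of the client-side pooling described above, 
using the stock Apache HttpClient 4.x API; the endpoint URL and port are 
placeholders, not the actual Ozone REST address.

{code}
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledOzoneClientSketch {
  public static void main(String[] args) throws Exception {
    PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
    cm.setMaxTotal(20);           // total pooled connections
    cm.setDefaultMaxPerRoute(10); // pooled connections per host
    try (CloseableHttpClient client =
             HttpClients.custom().setConnectionManager(cm).build()) {
      // Back-to-back requests on one client: the second request is exactly
      // the case that hangs against the current object store handler.
      for (int i = 0; i < 2; i++) {
        try (CloseableHttpResponse rsp =
                 client.execute(new HttpGet("http://localhost:9864/"))) {
          System.out.println(rsp.getStatusLine());
        }
      }
    }
  }
}
{code}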



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11771) Ozone: KSM: Add checkVolumeAccess

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033565#comment-16033565
 ] 

Hadoop QA commented on HDFS-11771:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
32s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
29s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
37s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 2 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
50s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
7s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
31s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 40s{color} | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 
9 unchanged - 1 fixed = 10 total (was 10) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
16s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m  8s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}160m 50s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestReconstructStripedBlocks 
|
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.cblock.TestCBlockServer |
|   | hadoop.ozone.scm.TestContainerSQLCli |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure150 |
|   | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.server.namenode.TestCacheDirectives |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090 |
|   | hadoop.cblock.TestBufferManager |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure110 |
| Timed out junit tests | org.apache.hadoop.cblock.TestLocalBlockCache |
\\
\\
|| 

[jira] [Commented] (HDFS-11804) KMS client needs retry logic

2017-06-01 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033551#comment-16033551
 ] 

Xiao Chen commented on HDFS-11804:
--

Thanks Rushabh.
bq. This means the user doesn't have access to the key. Even if we retry, it's 
going to fail anyway unless some servers are misconfigured.
Correct. So we should treat {{AuthenticationException}} the same as 
{{AccessControlException}} and {{AuthorizationException}}, right? :)

bq. Can you elaborate on the context?
Say, for example, we have 3 KMS's. I think we should retry immediately for all 3 
at first (same as the current LBKMSCP behavior).
If the multiplier is > 1, we could retry with delays on the 2nd pass.

> KMS client needs retry logic
> 
>
> Key: HDFS-11804
> URL: https://issues.apache.org/jira/browse/HDFS-11804
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: Rushabh S Shah
>Assignee: Rushabh S Shah
> Attachments: HDFS-11804-trunk-1.patch, HDFS-11804-trunk.patch
>
>
> The KMS client appears to have no retry logic – at all.  It's completely 
> decoupled from the IPC retry logic.  This has major impacts if the KMS is 
> unreachable for any reason, including but not limited to network connection 
> issues, timeouts, or the +restart during an upgrade+.
> This has some major ramifications:
> # Jobs may fail to submit, although the oozie resubmit logic should mask it
> # Non-oozie launchers may experience higher failure rates if they do not 
> already have retry logic
> # Tasks reading EZ files will fail, though this will probably be masked by 
> framework reattempts
> # EZ file creation fails after creating a 0-length file – the client receives 
> the EDEK in the create response, then fails when decrypting it
> # Bulk hadoop fs copies, and maybe distcp, will fail prematurely



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11911) SnapshotDiff should maintain the order of file/dir creation and deletion

2017-06-01 Thread Manoj Govindassamy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033530#comment-16033530
 ] 

Manoj Govindassamy commented on HDFS-11911:
---

[~jingzhao], [~yzhangal], any thoughts on maintaining the order of diff entries?

> SnapshotDiff should maintain the order of file/dir creation and deletion
> 
>
> Key: HDFS-11911
> URL: https://issues.apache.org/jira/browse/HDFS-11911
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, snapshots
>Affects Versions: 3.0.0-alpha1
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> {{DirectoryWithSnapshotFeature}} maintains separate lists for CREATED and 
> DELETED children, but the ordering of these creation and deletion events is 
> not maintained. Assume a case like the one below, where time grows 
> downwards...
> {noformat}
> |
> +  CREATE File-1
> |
> + Snap S1 created
> |
> + DELETE File-1
> |
> + Snap S2 created
> |
> + CREATE File-1
> |
> + Snap S3 created
> |
> |
> V
> {noformat} 
> The snapshot diff report takes in the DirectoryWithSnapshotFeature diff 
> entries and just prints all the creations first and then the deletions, 
> thereby giving the perception that file-1 got created first and then got 
> deleted. But after S3, file-1 is still available. 
> {noformat}
> The difference between snapshot S1 and snapshot S3 under the directory /:
> M .
> + ./file-1
> - ./file-1
> {noformat}
> Can we have DirectoryWithSnapshotFeature maintain the diff entries ordered by 
> time or sequence? 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-11911) SnapshotDiff should maintain the order of file/dir creation and deletion

2017-06-01 Thread Manoj Govindassamy (JIRA)
Manoj Govindassamy created HDFS-11911:
-

 Summary: SnapshotDiff should maintain the order of file/dir 
creation and deletion
 Key: HDFS-11911
 URL: https://issues.apache.org/jira/browse/HDFS-11911
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs, snapshots
Affects Versions: 3.0.0-alpha1
Reporter: Manoj Govindassamy
Assignee: Manoj Govindassamy


{{DirectoryWithSnapshotFeature}} maintains separate lists for CREATED and 
DELETED children, but the ordering of these creation and deletion events is not 
maintained. Assume a case like the one below, where time grows downwards...
{noformat}
|
+  CREATE File-1
|
+ Snap S1 created
|
+ DELETE File-1
|
+ Snap S2 created
|
+ CREATE File-1
|
+ Snap S3 created
|
|
V
{noformat} 

The snapshot diff report takes in the DirectoryWithSnapshotFeature diff 
entries and just prints all the creations first and then the deletions, thereby 
giving the perception that file-1 got created first and then got deleted. But 
after S3, file-1 is still available. 

{noformat}
The difference between snapshot S1 and snapshot S3 under the directory /:
M   .
+   ./file-1
-   ./file-1
{noformat}

Can we have DirectoryWithSnapshotFeature maintain the diff entries ordered by 
time or sequence? 
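
One possible shape for that, as a minimal sketch; this is purely an assumption 
about how a fix could look, not the actual DirectoryWithSnapshotFeature 
internals: stamp each created/deleted child with a monotonically increasing 
sequence number and report the entries in that single ordered list.

{code}
import java.util.ArrayList;
import java.util.List;

// Hedged sketch: one sequence-ordered event list instead of separate
// CREATED and DELETED lists, so the diff report preserves event order.
class OrderedDiffSketch {
  enum Op { CREATE, DELETE }

  static class Entry {
    final long seq;
    final Op op;
    final String name;
    Entry(long seq, Op op, String name) {
      this.seq = seq; this.op = op; this.name = name;
    }
  }

  private final List<Entry> events = new ArrayList<Entry>();
  private long nextSeq = 0;

  void recordCreate(String name) { events.add(new Entry(nextSeq++, Op.CREATE, name)); }
  void recordDelete(String name) { events.add(new Entry(nextSeq++, Op.DELETE, name)); }

  public static void main(String[] args) {
    OrderedDiffSketch diff = new OrderedDiffSketch();
    diff.recordCreate("file-1"); // the events from the scenario above
    diff.recordDelete("file-1");
    diff.recordCreate("file-1");
    for (Entry e : diff.events) {
      System.out.println((e.op == Op.CREATE ? "+ " : "- ") + "./" + e.name);
    }
    // Prints +, -, + in event order, so file-1 is correctly seen
    // as still existing after S3.
  }
}
{code}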




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033513#comment-16033513
 ] 

Virajith Jalaparti commented on HDFS-11673:
---

Posting patch v5, which essentially rebases v4 on the latest HDFS-9806 branch. 
Will commit to the branch once Jenkins returns.

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11673:
--
Status: Patch Available  (was: Open)

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11673:
--
Status: Open  (was: Patch Available)

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11673) [READ] Handle failures of Datanode with PROVIDED storage

2017-06-01 Thread Virajith Jalaparti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Virajith Jalaparti updated HDFS-11673:
--
Attachment: HDFS-11673-HDFS-9806.005.patch

> [READ] Handle failures of Datanode with PROVIDED storage
> 
>
> Key: HDFS-11673
> URL: https://issues.apache.org/jira/browse/HDFS-11673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-11673-HDFS-9806.001.patch, 
> HDFS-11673-HDFS-9806.002.patch, HDFS-11673-HDFS-9806.003.patch, 
> HDFS-11673-HDFS-9806.004.patch, HDFS-11673-HDFS-9806.005.patch
>
>
> Blocks on {{PROVIDED}} storage should become unavailable if and only if all 
> Datanodes that are configured with {{PROVIDED}} storage become unavailable. 
> Even if one Datanode with {{PROVIDED}} storage is available, all blocks on 
> the {{PROVIDED}} storage should be accessible.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033506#comment-16033506
 ] 

Xiao Chen commented on HDFS-11741:
--

Turns out YETUS-515 is the same as HADOOP-14474. Commented there to see if we 
can unblock branch-2 soon.
Will manually compile and run the related tests if it's not done by the end of 
today, if there are no objections.

> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, because 
> KeyManager returns an expired DataEncryptionKey, and it throws the following 
> exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does not 
> update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running balancer 
> failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11894) Ozone: Cleanup imports

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033505#comment-16033505
 ] 

Anu Engineer commented on HDFS-11894:
-

I will commit this shortly. The findbugs warnings are not from ozone code, but 
the javac deprecation warning seems real. I will explore whether we have any 
way to fix that; if so, I will file another JIRA. The test failures are not 
related to this patch.


> Ozone: Cleanup imports
> --
>
> Key: HDFS-11894
> URL: https://issues.apache.org/jira/browse/HDFS-11894
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
>Priority: Trivial
> Attachments: HDFS-11894-HDFS-7240.001.patch
>
>
> As discussed in HDFS-11846, we have some imports like Nullable and 
> NonNullable that might be imported from packages we don't intend to take 
> dependencies on. This JIRA tracks cleaning those up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11894) Ozone: Cleanup imports

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033497#comment-16033497
 ] 

Hadoop QA commented on HDFS-11894:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
 6s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
33s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
43s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
30s{color} | {color:green} HDFS-7240 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
36s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in HDFS-7240 
has 2 extant Findbugs warnings. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
52s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs in HDFS-7240 has 10 
extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
34s{color} | {color:green} HDFS-7240 passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 32s{color} 
| {color:red} hadoop-hdfs-project generated 1 new + 55 unchanged - 0 fixed = 56 
total (was 55) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
15s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 67m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}107m 40s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure010 |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.cblock.TestCBlockCLI |
|   | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:14b5c93 |
| JIRA Issue | HDFS-11894 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870717/HDFS-11894-HDFS-7240.001.patch
 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  

[jira] [Updated] (HDFS-11383) String duplication in org.apache.hadoop.fs.BlockLocation

2017-06-01 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HDFS-11383:
--

Hi Andrew, I think this time I've addressed all your concerns and there is
nothing from findbugs or checkstyle. Could you please merge this patch?





> String duplication in org.apache.hadoop.fs.BlockLocation
> 
>
> Key: HDFS-11383
> URL: https://issues.apache.org/jira/browse/HDFS-11383
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
> Attachments: HDFS-11383.01.patch, HDFS-11383.02.patch, 
> HDFS-11383.03.patch, HDFS-11383.04.patch, hs2-crash-2.txt
>
>
> I am working on Hive performance, investigating the problem of high memory 
> pressure when (a) a table consists of a high number (thousands) of partitions 
> and (b) multiple queries run against it concurrently. It turns out that a lot 
> of memory is wasted due to data duplication. One source of duplicate strings 
> is class org.apache.hadoop.fs.BlockLocation. Its fields such as storageIds, 
> topologyPaths, hosts, names, may collectively use up to 6% of memory in my 
> benchmark, causing (together with other problematic classes) a huge memory 
> spike. Of these 6% of memory taken by BlockLocation strings, more than 5% are 
> wasted due to duplication.
> I think we need to add calls to String.intern() in the BlockLocation 
> constructor, like:
> {code}
> this.hosts = internStringsInArray(hosts);
> ...
> private String[] internStringsInArray(String[] sar) {
>   for (int i = 0; i < sar.length; i++) {
>     sar[i] = sar[i].intern();
>   }
>   return sar;
> }
> {code}
> String.intern() performs very well starting from JDK 7. I've found some 
> articles explaining the progress that was made by the HotSpot JVM developers 
> in this area, verified that with benchmarks myself, and finally added quite a 
> bit of interning to one of the Cloudera products without any issues.
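
A tiny self-contained demonstration of the mechanism, using a hypothetical host 
name: two distinct heap strings collapse to one canonical copy after intern().

{code}
public class InternDemo {
  public static void main(String[] args) {
    String a = new String("datanode-17.example.com"); // hypothetical host name
    String b = new String("datanode-17.example.com");
    System.out.println(a == b);                   // false: two heap copies
    System.out.println(a.intern() == b.intern()); // true: one canonical copy
  }
}
{code}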



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033444#comment-16033444
 ] 

Xiao Chen commented on HDFS-11741:
--

INFRA-14261 is fixed, but YETUS-515 surfaced.

> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, because 
> KeyManager returns an expired DataEncryptionKey, and it throws the following 
> exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does not 
> update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running balancer 
> failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11741) Long running balancer may fail due to expired DataEncryptionKey

2017-06-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033424#comment-16033424
 ] 

Hadoop QA commented on HDFS-11741:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
18s{color} | {color:red} Docker failed to build yetus/hadoop:8515d35. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-11741 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12870697/HDFS-11741.branch-2.01.patch
 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/19730/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Long running balancer may fail due to expired DataEncryptionKey
> ---
>
> Key: HDFS-11741
> URL: https://issues.apache.org/jira/browse/HDFS-11741
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
> Environment: CDH5.8.2, Kerberos, Data transfer encryption enabled. 
> Balancer login using keytab
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: block keys.png, HDFS-11741.001.patch, 
> HDFS-11741.002.patch, HDFS-11741.003.patch, HDFS-11741.004.patch, 
> HDFS-11741.005.patch, HDFS-11741.06.patch, HDFS-11741.07.patch, 
> HDFS-11741.08.patch, HDFS-11741.branch-2.01.patch
>
>
> We found that a long-running balancer may fail despite using a keytab, because 
> KeyManager returns an expired DataEncryptionKey, and it throws the following 
> exception:
> {noformat}
> 2017-04-30 05:03:58,661 WARN  [pool-1464-thread-10] balancer.Dispatcher 
> (Dispatcher.java:dispatch(325)) - Failed to move blk_1067352712_3913241 with 
> size=546650 from 10.0.0.134:50010:DISK to 10.0.0.98:50010:DISK through 
> 10.0.0.134:50010
> org.apache.hadoop.hdfs.protocol.datatransfer.InvalidEncryptionKeyException: 
> Can't re-compute encryption key for nonce, since the required block key 
> (keyID=1005215027) doesn't exist. Current key: 1005215030
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessageAndNegotiatedCipherOption(DataTransferSaslUtil.java:417)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.doSaslHandshake(SaslDataTransferClient.java:474)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.getEncryptedStreams(SaslDataTransferClient.java:299)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.send(SaslDataTransferClient.java:242)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.checkTrustAndSend(SaslDataTransferClient.java:211)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient.socketSend(SaslDataTransferClient.java:183)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.dispatch(Dispatcher.java:311)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$PendingMove.access$2300(Dispatcher.java:182)
> at 
> org.apache.hadoop.hdfs.server.balancer.Dispatcher$1.run(Dispatcher.java:899)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> This bug is similar in nature to HDFS-10609. While the balancer KeyManager 
> actively synchronizes itself with the NameNode w.r.t. block keys, it does not 
> update the DataEncryptionKey accordingly.
> In a specific cluster, with a Kerberos ticket lifetime of 10 hours and a 
> default block token expiration/lifetime of 10 hours, a long-running balancer 
> failed after 20~30 hours.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11822) Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"

2017-06-01 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033405#comment-16033405
 ] 

Chen Liang commented on HDFS-11822:
---

Thanks [~msingh] for the update, +1 for the v003 patch, pending Jenkins.

> Block Storage: Fix TestCBlockCLI, failing because of " Address already in use"
> --
>
> Key: HDFS-11822
> URL: https://issues.apache.org/jira/browse/HDFS-11822
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11822-HDFS-7240.001.patch, 
> HDFS-11822-HDFS-7240.002.patch, HDFS-11822-HDFS-7240.003.patch
>
>
> TestCBlockCLI is failing because of bind error.
> https://builds.apache.org/job/PreCommit-HDFS-Build/19429/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
> {code}
> org.apache.hadoop.cblock.TestCBlockCLI  Time elapsed: 0.668 sec  <<< ERROR!
> java.net.BindException: Problem binding to [0.0.0.0:9810] 
> java.net.BindException: Address already in use; For more details see:  
> http://wiki.apache.org/hadoop/BindException
>   at sun.nio.ch.Net.bind0(Native Method)
>   at sun.nio.ch.Net.bind(Net.java:433)
>   at sun.nio.ch.Net.bind(Net.java:425)
>   at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
>   at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
>   at org.apache.hadoop.ipc.Server.bind(Server.java:543)
>   at org.apache.hadoop.ipc.Server$Listener.(Server.java:1033)
>   at org.apache.hadoop.ipc.Server.(Server.java:2791)
>   at org.apache.hadoop.ipc.RPC$Server.(RPC.java:960)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:420)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:341)
>   at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:802)
>   at 
> org.apache.hadoop.cblock.CBlockManager.startRpcServer(CBlockManager.java:215)
>   at org.apache.hadoop.cblock.CBlockManager.(CBlockManager.java:131)
>   at org.apache.hadoop.cblock.TestCBlockCLI.setup(TestCBlockCLI.java:57)
> {code}
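
A common remedy for this class of failure, shown as a minimal sketch (whether 
the attached patch takes this exact route is not implied here): bind to an 
OS-assigned ephemeral port instead of the fixed 9810.

{code}
import java.net.ServerSocket;

public class FreePortSketch {
  public static void main(String[] args) throws Exception {
    // Port 0 asks the OS for any free port, so concurrent test runs
    // cannot collide on a hard-coded port such as 9810.
    try (ServerSocket probe = new ServerSocket(0)) {
      int port = probe.getLocalPort();
      System.out.println("configure the test RPC server on port " + port);
    }
  }
}
{code}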



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-5042) Completed files lost after power failure

2017-06-01 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-5042:
-
Fix Version/s: 2.8.2
   2.7.4

> Completed files lost after power failure
> 
>
> Key: HDFS-5042
> URL: https://issues.apache.org/jira/browse/HDFS-5042
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: ext3 on CentOS 5.7 (kernel 2.6.18-274.el5)
>Reporter: Dave Latham
>Assignee: Vinayakumar B
>Priority: Critical
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-5042-01.patch, HDFS-5042-02.patch, 
> HDFS-5042-03.patch, HDFS-5042-04.patch, HDFS-5042-05-branch-2.patch, 
> HDFS-5042-05.patch, HDFS-5042-branch-2-01.patch, HDFS-5042-branch-2-05.patch, 
> HDFS-5042-branch-2.7-05.patch, HDFS-5042-branch-2.7-06.patch, 
> HDFS-5042-branch-2.8-05.patch, HDFS-5042-branch-2.8-06.patch
>
>
> We suffered a cluster-wide power failure after which HDFS lost data that it 
> had acknowledged as closed and complete.
> The client was HBase which compacted a set of HFiles into a new HFile, then 
> after closing the file successfully, deleted the previous versions of the 
> file.  The cluster then lost power, and when brought back up the newly 
> created file was marked CORRUPT.
> Based on reading the logs it looks like the replicas were created by the 
> DataNodes in the 'blocksBeingWritten' directory.  Then when the file was 
> closed they were moved to the 'current' directory.  After the power cycle 
> those replicas were again in the blocksBeingWritten directory of the 
> underlying file system (ext3).  When those DataNodes reported in to the 
> NameNode it deleted those replicas and lost the file.
> Some possible fixes could be having the DataNode fsync the directory(s) after 
> moving the block from blocksBeingWritten to current, to ensure the rename is 
> durable, or having the NameNode accept replicas from blocksBeingWritten under 
> certain circumstances.
> Log snippets from RS (RegionServer), NN (NameNode), DN (DataNode):
> {noformat}
> RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: 
> Creating 
> file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  with permission=rwxrwxrwx
> NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c.
>  blk_1395839728632046111_357084589
> DN 2013-06-29 11:16:06,832 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block 
> blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: 
> /10.0.5.237:50010
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Received block 
> blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block 
> blk_1395839728632046111_357084589 terminating
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing 
> lease on  file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  from client DFSClient_hb_rs_hs745,60020,1372470111932
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  is closed by DFSClient_hb_rs_hs745,60020,1372470111932
> RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Renaming compacted file at 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  to 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
> RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Completed major compaction of 7 file(s) in n of 
> users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into 
> 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; total size for store is 24.2m
> ---  CRASH, RESTART -
> NN 2013-06-29 

[jira] [Commented] (HDFS-5042) Completed files lost after power failure

2017-06-01 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033367#comment-16033367
 ] 

Kihwal Lee commented on HDFS-5042:
--

The patches look good. Now the fix is in branch-2.8 and branch-2.7.  Thanks for 
fixing this, Vinay.
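
For illustration, a minimal sketch of the directory-fsync idea floated in the 
description; this is one of the possible fixes mentioned there, sketched under 
the assumption of Linux semantics, where a directory opened read-only can be 
force()d to persist its entries.

{code}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class DirFsyncSketch {
  static void fsyncDirectory(Path dir) throws IOException {
    // On Linux, opening the directory read-only and calling force(true)
    // flushes the directory entries, making a preceding rename durable.
    try (FileChannel ch = FileChannel.open(dir, StandardOpenOption.READ)) {
      ch.force(true);
    }
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical DataNode block directory, after the rename from
    // blocksBeingWritten to current.
    fsyncDirectory(Paths.get("/data/dn/current"));
  }
}
{code}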

> Completed files lost after power failure
> 
>
> Key: HDFS-5042
> URL: https://issues.apache.org/jira/browse/HDFS-5042
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: ext3 on CentOS 5.7 (kernel 2.6.18-274.el5)
>Reporter: Dave Latham
>Assignee: Vinayakumar B
>Priority: Critical
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha4, 2.8.2
>
> Attachments: HDFS-5042-01.patch, HDFS-5042-02.patch, 
> HDFS-5042-03.patch, HDFS-5042-04.patch, HDFS-5042-05-branch-2.patch, 
> HDFS-5042-05.patch, HDFS-5042-branch-2-01.patch, HDFS-5042-branch-2-05.patch, 
> HDFS-5042-branch-2.7-05.patch, HDFS-5042-branch-2.7-06.patch, 
> HDFS-5042-branch-2.8-05.patch, HDFS-5042-branch-2.8-06.patch
>
>
> We suffered a cluster-wide power failure after which HDFS lost data that it 
> had acknowledged as closed and complete.
> The client was HBase which compacted a set of HFiles into a new HFile, then 
> after closing the file successfully, deleted the previous versions of the 
> file.  The cluster then lost power, and when brought back up the newly 
> created file was marked CORRUPT.
> Based on reading the logs it looks like the replicas were created by the 
> DataNodes in the 'blocksBeingWritten' directory.  Then when the file was 
> closed they were moved to the 'current' directory.  After the power cycle 
> those replicas were again in the blocksBeingWritten directory of the 
> underlying file system (ext3).  When those DataNodes reported in to the 
> NameNode it deleted those replicas and lost the file.
> Some possible fixes could be having the DataNode fsync the directory(s) after 
> moving the block from blocksBeingWritten to current, to ensure the rename is 
> durable, or having the NameNode accept replicas from blocksBeingWritten under 
> certain circumstances.
> Log snippets from RS (RegionServer), NN (NameNode), DN (DataNode):
> {noformat}
> RS 2013-06-29 11:16:06,812 DEBUG org.apache.hadoop.hbase.util.FSUtils: 
> Creating 
> file=hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  with permission=rwxrwxrwx
> NN 2013-06-29 11:16:06,830 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c.
>  blk_1395839728632046111_357084589
> DN 2013-06-29 11:16:06,832 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block 
> blk_1395839728632046111_357084589 src: /10.0.5.237:14327 dest: 
> /10.0.5.237:50010
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.1:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,370 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.6.24:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.addStoredBlock: blockMap updated: 10.0.5.237:50010 is added to 
> blk_1395839728632046111_357084589 size 25418340
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: Received block 
> blk_1395839728632046111_357084589 of size 25418340 from /10.0.5.237:14327
> DN 2013-06-29 11:16:11,385 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block 
> blk_1395839728632046111_357084589 terminating
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: Removing 
> lease on  file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  from client DFSClient_hb_rs_hs745,60020,1372470111932
> NN 2013-06-29 11:16:11,385 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  is closed by DFSClient_hb_rs_hs745,60020,1372470111932
> RS 2013-06-29 11:16:11,393 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Renaming compacted file at 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/.tmp/6e0cc30af6e64e56ba5a539fdf159c4c
>  to 
> hdfs://hm3:9000/hbase/users-6/b5b0820cde759ae68e333b2f4015bb7e/n/6e0cc30af6e64e56ba5a539fdf159c4c
> RS 2013-06-29 11:16:11,505 INFO org.apache.hadoop.hbase.regionserver.Store: 
> Completed major compaction of 7 file(s) in n of 
> users-6,\x12\xBDp\xA3,1359426311784.b5b0820cde759ae68e333b2f4015bb7e. into 
> 6e0cc30af6e64e56ba5a539fdf159c4c, size=24.2m; 

[jira] [Comment Edited] (HDFS-11909) Ozone: KSM : Support for simulated file system operations

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033328#comment-16033328
 ] 

Anu Engineer edited comment on HDFS-11909 at 6/1/17 5:28 PM:
-

bq. The path name is not allowed to contain extra slashes, or other special 
characters; we should prevent users from creating such path names.
We certainly could do that, or for the purpose of these APIs we could treat one 
or more slashes as a single slash; either one will work.



was (Author: anu):
bq. The path name is not allowed to contain extra slashes, or other special 
character, we should prevent users from creating such kind of path name.
We certainly could or for the purpose of these API we will treat one or more / 
as a single slash, either one will work.


> Ozone: KSM :  Support for simulated file system operations
> --
>
> Key: HDFS-11909
> URL: https://issues.apache.org/jira/browse/HDFS-11909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: simulation-file-system.pdf
>
>
> This JIRA adds a proposal that makes it easy to implement OzoneFileSystem. 
> This makes the directory and file listing operations simpler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11909) Ozone: KSM : Support for simulated file system operations

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033328#comment-16033328
 ] 

Anu Engineer commented on HDFS-11909:
-

bq. The path name is not allowed to contain extra slashes, or other special 
characters; we should prevent users from creating such path names.
We certainly could do that, or for the purpose of these APIs we could treat one 
or more slashes as a single slash; either one will work.
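
Collapsing repeated slashes is a one-liner; a minimal sketch of that 
interpretation, with a hypothetical key name:

{code}
public class SlashNormalizeSketch {
  public static void main(String[] args) {
    String path = "/vol1//bucket///key";            // hypothetical key name
    String normalized = path.replaceAll("/+", "/"); // runs of '/' become one
    System.out.println(normalized);                 // prints /vol1/bucket/key
  }
}
{code}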


> Ozone: KSM :  Support for simulated file system operations
> --
>
> Key: HDFS-11909
> URL: https://issues.apache.org/jira/browse/HDFS-11909
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Anu Engineer
> Attachments: simulation-file-system.pdf
>
>
> This JIRA adds a proposal that makes it easy to implement OzoneFileSystem. 
> This makes the directory and file listing operations simpler.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11771) Ozone: KSM: Add checkVolumeAccess

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033318#comment-16033318
 ] 

Anu Engineer commented on HDFS-11771:
-

Force building the patch -- 
https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HDFS-Build/19728/

> Ozone: KSM:  Add checkVolumeAccess
> --
>
> Key: HDFS-11771
> URL: https://issues.apache.org/jira/browse/HDFS-11771
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Mukul Kumar Singh
> Attachments: HDFS-11771-HDFS-7240.001.patch, 
> HDFS-11771-HDFS-7240.002.patch, HDFS-11771-HDFS-7240.003.patch, 
> HDFS-11771-HDFS-7240.004.patch, HDFS-11771-HDFS-7240.005.patch, 
> HDFS-11771-HDFS-7240.006.patch
>
>
> Checks if the caller has access to a given volume. This call supports the 
> ACLs specified in the ozone rest protocol documentation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11894) Ozone: Cleanup imports

2017-06-01 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16033320#comment-16033320
 ] 

Anu Engineer commented on HDFS-11894:
-

Force building the patch -- 
https://builds.apache.org/view/PreCommit%20Builds/job/PreCommit-HDFS-Build/19729/

> Ozone: Cleanup imports
> --
>
> Key: HDFS-11894
> URL: https://issues.apache.org/jira/browse/HDFS-11894
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Anu Engineer
>Assignee: Weiwei Yang
>Priority: Trivial
> Attachments: HDFS-11894-HDFS-7240.001.patch
>
>
> As discussed in HDFS-11846, we have some imports like Nullable and 
> NonNullable that might be imported from packages we don't intend to take 
> dependencies on. This JIRA tracks cleaning those up.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


