[jira] [Updated] (HDFS-8150) Make getFileChecksum fail for blocks under construction
[ https://issues.apache.org/jira/browse/HDFS-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

J.Andreina updated HDFS-8150:
-----------------------------
    Attachment: HDFS-8150.1.patch

Attached an initial patch. Please review.

Make getFileChecksum fail for blocks under construction
-------------------------------------------------------
                 Key: HDFS-8150
                 URL: https://issues.apache.org/jira/browse/HDFS-8150
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Kihwal Lee
            Assignee: J.Andreina
            Priority: Critical
         Attachments: HDFS-8150.1.patch

We have seen cases where a data copy was validated with checksums and the content of the target later changed. It turned out the target had not been closed successfully, so it was still under construction; an hour later a lease recovery kicked in and truncated the block. Although this can be prevented in many ways, if there is no valid use case for getting the file checksum of under-construction blocks, can it be disabled? E.g. the Datanode can throw an exception if the replica is not in the finalized state.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
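The behavior proposed in the description (Datanode throws if the replica is not finalized) can be sketched roughly as below. This is a minimal illustration assuming a simplified replica-state enum; `ChecksumGuard` and `checkFinalized` are hypothetical names, not the code in HDFS-8150.1.patch.

```java
import java.io.IOException;

// Sketch of the proposal in the description: refuse to compute a file
// checksum for a replica that is not FINALIZED. The enum mirrors the
// datanode replica states; ChecksumGuard/checkFinalized are hypothetical
// names, not the actual HDFS-8150 patch.
public class ChecksumGuard {
    public enum ReplicaState { FINALIZED, RBW, RWR, RUR, TEMPORARY }

    /** Throw if a checksum is requested for a non-finalized replica. */
    public static void checkFinalized(String block, ReplicaState state) throws IOException {
        if (state != ReplicaState.FINALIZED) {
            throw new IOException("Cannot compute checksum for " + block
                + ": replica state is " + state + ", expected FINALIZED");
        }
    }
}
```

With this check in place, the client's getFileChecksum call fails fast instead of returning a checksum that a later lease recovery may invalidate.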
[jira] [Updated] (HDFS-8150) Make getFileChecksum fail for blocks under construction
[ https://issues.apache.org/jira/browse/HDFS-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

J.Andreina updated HDFS-8150:
-----------------------------
    Status: Patch Available  (was: Open)
[jira] [Comment Edited] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517080#comment-14517080 ]

Yi Liu edited comment on HDFS-7348 at 4/28/15 2:20 PM:
-------------------------------------------------------
{noformat}
Recover one or more missed striped blocks in a striped block group; the number
of live striped blocks must be no less than the number of data blocks.

 |<----------- Striped Block Group ----------->|
  blk_0      blk_1      blk_2(*)   blk_3  ...   <- A striped block group
    |          |          |          |
    v          v          v          v
 +------+   +------+   +------+   +------+
 |cell_0|   |cell_1|   |cell_2|   |cell_3|  ...  <- The striped cell group
 +------+   +------+   +------+   +------+          (cell_0, cell_1, ...)
 |cell_4|   |cell_5|   |cell_6|   |cell_7|  ...
 +------+   +------+   +------+   +------+
 |cell_8|   |cell_9|   |cell10|   |cell11|  ...
 +------+   +------+   +------+   +------+
   ...        ...        ...        ...

We use the following steps to recover the striped cell groups sequentially:
step1: read the minimum striped cells required by recovery.
step2: decode cells for the targets.
step3: transfer cells to the targets.

In step1, we try to read the minimum number of striped cells; if there are
corrupt or stale sources, a read from a new source is scheduled. The best
sources are remembered for the next round and may be updated in each round.

In step2, decoding is blocked by HADOOP-11847; currently we only fill 1... to
the target block for testing. Typically, if the source blocks we read are all
data blocks, we need to call encode, and if one of them is a parity block, we
need to call decode. Notice we read only once and recover all missed striped
blocks even if there are more than one.

In step3, we send the recovered cells to the targets by constructing packets
and sending them directly. As with continuous block replication, we don't
check the packet ack. Since the datanode doing the recovery work is one of
the source datanodes, the recovered cells are sent remotely.

There are some points where we can make further improvements in the next phase:
1. We can read the block file directly on the local datanode; currently we use
   a remote block reader. (Notice short-circuit is not a good choice; see the
   inline comments.)
2. Should we check the packet ack for EC recovery? Since EC recovery is more
   expensive than continuous block replication and needs to read from several
   other datanodes, should we make sure the recovered result is received by
   the targets?
{noformat}

Erasure Coding: striped block recovery
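The three-step loop described in the comment can be sketched as follows for the simplest, single-parity case, with XOR standing in for the real erasure decoder (which the comment notes is blocked by HADOOP-11847). All class and method names here are illustrative, not HDFS APIs.

```java
import java.util.List;

// Minimal sketch of the recovery loop above (read minimum cells, decode,
// transfer), for XOR parity only. With XOR, encode and decode are the same
// operation. StripedRecoverySketch/recoverMissingCell are hypothetical names.
public class StripedRecoverySketch {

    // step1 supplies 'liveCells': the minimum set of live cells of one stripe
    // (data and/or parity). step2 XORs them to rebuild the one missing cell.
    public static byte[] recoverMissingCell(List<byte[]> liveCells) {
        byte[] out = new byte[liveCells.get(0).length];
        for (byte[] cell : liveCells) {
            for (int i = 0; i < out.length; i++) {
                out[i] ^= cell[i];
            }
        }
        // step3 would packetize 'out' and send it to the target datanode,
        // without waiting for a packet ack (as with continuous replication).
        return out;
    }
}
```

For example, with data cells {1,2,3} and {4,5,6}, the XOR parity cell is {5,7,5}; XORing the surviving data cell with the parity cell rebuilds the missing one.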
[jira] [Updated] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yi Liu updated HDFS-7348:
-------------------------
    Description: This JIRA is to recover one or more missed striped blocks in a striped block group.  (was: This assumes the facilities like block reader and writer are ready, implements and performs erasure decoding/recovery work in *stripping* case utilizing erasure codec and coder provided by the codec framework.)

Erasure Coding: striped block recovery
--------------------------------------
                 Key: HDFS-7348
                 URL: https://issues.apache.org/jira/browse/HDFS-7348
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: datanode
            Reporter: Kai Zheng
            Assignee: Yi Liu
         Attachments: ECWorker.java, HDFS-7348.001.patch

This JIRA is to recover one or more missed striped blocks in a striped block group.
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516940#comment-14516940 ]

Binglin Chang commented on HDFS-5574:
-------------------------------------
Strange: the test error is caused by a NoSuchMethodError, which should not happen if the code compiled successfully. Is there a bug in the test-patch process?
{code}
java.lang.NoSuchMethodError: org.apache.hadoop.fs.FSInputChecker.readAndDiscard(I)I
	at org.apache.hadoop.hdfs.RemoteBlockReader.read(RemoteBlockReader.java:128)
	at org.apache.hadoop.hdfs.DFSInputStream$ByteArrayStrategy.doRead(DFSInputStream.java:740)
	at org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:796)
	at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:856)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:899)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:700)
	at org.apache.hadoop.hdfs.TestDFSInputStream.testSkipInner(TestDFSInputStream.java:61)
	at org.apache.hadoop.hdfs.TestDFSInputStream.testSkipWithRemoteBlockReader(TestDFSInputStream.java:76)
{code}

Remove buffer copy in BlockReader.skip
--------------------------------------
                 Key: HDFS-5574
                 URL: https://issues.apache.org/jira/browse/HDFS-5574
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Binglin Chang
            Assignee: Binglin Chang
            Priority: Trivial
         Attachments: HDFS-5574.006.patch, HDFS-5574.007.patch, HDFS-5574.008.patch, HDFS-5574.v1.patch, HDFS-5574.v2.patch, HDFS-5574.v3.patch, HDFS-5574.v4.patch, HDFS-5574.v5.patch

BlockReaderLocal.skip and RemoteBlockReader.skip read data into a temporary buffer, which is unnecessary.
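For context, the optimization under review can be modeled like this: skip by reading into one small reusable scratch buffer and discarding the data, rather than allocating a temporary buffer per skip. This is a sketch only; the real patch adds `FSInputChecker.readAndDiscard(int)` (visible in the stack trace above) and changes the block readers, and `SkipSketch` is a hypothetical name.

```java
import java.io.IOException;
import java.io.InputStream;

// Sketch of the idea behind the HDFS-5574 patch: skip bytes by reading into a
// reusable scratch buffer and discarding them, so no per-skip temporary
// buffer is allocated. Simplified model, not the actual block reader change.
public class SkipSketch {
    private static final byte[] SCRATCH = new byte[512];  // reused, never handed to callers

    /** Skip n bytes by reading and discarding; returns bytes actually skipped. */
    public static long skipByDiscard(InputStream in, long n) throws IOException {
        long skipped = 0;
        while (skipped < n) {
            int want = (int) Math.min(SCRATCH.length, n - skipped);
            int got = in.read(SCRATCH, 0, want);
            if (got < 0) {
                break;  // EOF before n bytes were skipped
            }
            skipped += got;
        }
        return skipped;
    }
}
```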
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517087#comment-14517087 ]

Yi Liu commented on HDFS-7348:
------------------------------
{{testFileBlocksRecovery}} tests recovery of the file's blocks:
1. Check that the replica is recovered on the target datanode, and verify the block replica's length, generationStamp, and content.
2. Read the file and verify its content.

Decoding is blocked by HADOOP-11847; I will update the test to read the file and verify its content after recovery. Currently we fill the block replica with 1... for testing.
[jira] [Updated] (HDFS-8268) Port conflict log for data node server is not sufficient
[ https://issues.apache.org/jira/browse/HDFS-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Shahid Khan updated HDFS-8268:
---------------------------------------
    Attachment: HDFS-8268

I thought the solution should be as per the attached patch. Please review the same.

Port conflict log for data node server is not sufficient
--------------------------------------------------------
                 Key: HDFS-8268
                 URL: https://issues.apache.org/jira/browse/HDFS-8268
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 2.7.0, 2.8.0
         Environment: x86_64 x86_64 x86_64 GNU/Linux
            Reporter: Mohammad Shahid Khan
            Assignee: Mohammad Shahid Khan
            Priority: Minor
         Attachments: HDFS-8268
   Original Estimate: 24h
  Remaining Estimate: 24h

DataNode startup fails due to a port conflict, but the exception logged by the server is not sufficient to identify the reason for the failure: when the data node HTTP port (dfs.datanode.http.address) conflicts, the log does not say which port is in use.

*Actual:*
2015-04-27 16:48:53,960 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:437)
	at sun.nio.ch.Net.bind(Net.java:429)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1021)
	at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:455)
	at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:844)
	at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:194)
	at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:340)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
	at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
	at java.lang.Thread.run(Thread.java:745)

*_The above log does not contain the information of the conflicting port._*

*Expected output:*
java.net.BindException: Problem binding to [0.0.0.0:50075] java.net.BindException: Address already in use; For more details see: http://wiki.apache.org/hadoop/BindException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
	at org.apache.hadoop.hdfs.server.datanode.web.DatanodeHttpServer.start(DatanodeHttpServer.java:160)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startInfoServer(DataNode.java:795)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1142)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.init(DataNode.java:439)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2420)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2298)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2349)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2540)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2564)
Caused by: java.net.BindException: Address already in use
	at sun.nio.ch.Net.bind0(Native Method)
	at sun.nio.ch.Net.bind(Net.java:437)
	at sun.nio.ch.Net.bind(Net.java:429)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
	at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:475)
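The expected output above comes from wrapping the raw BindException so its message names the address being bound. Below is a self-contained sketch of that idea; it mirrors, but does not reproduce, what Hadoop's `NetUtils.wrapException` does (visible in the expected stack trace), and `BindDiagnostics` is a hypothetical name.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch of the proposed fix: catch the bare "Address already in use" and
// rethrow it with the bind address and port in the message. Hypothetical
// helper, not the actual DatanodeHttpServer change.
public class BindDiagnostics {
    public static BindException wrapBind(InetSocketAddress addr, BindException cause) {
        BindException wrapped = new BindException(
            "Problem binding to [" + addr + "] " + cause.getMessage()
            + "; For more details see: http://wiki.apache.org/hadoop/BindException");
        wrapped.initCause(cause);
        return wrapped;
    }

    /** Bind, converting a bare BindException into one that names the port. */
    public static ServerSocket bind(InetSocketAddress addr) throws IOException {
        ServerSocket ss = new ServerSocket();
        try {
            ss.bind(addr);
            return ss;
        } catch (BindException e) {
            ss.close();
            throw wrapBind(addr, e);
        }
    }
}
```

The wrapped exception keeps the original as its cause, so the full netty stack trace is still available while the top-level message identifies the conflicting port.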
[jira] [Updated] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yi Liu updated HDFS-7348:
-------------------------
    Summary: Erasure Coding: striped block recovery  (was: Erasure Coding: perform stripping erasure decoding/recovery work given block reader and writer)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517080#comment-14517080 ]

Yi Liu commented on HDFS-7348:
------------------------------
{noformat}
DataRecoveryAndTransfer recovers one or more missed striped blocks in the striped block group; the number of live striped blocks must be no less than the number of data blocks.
{noformat}
[jira] [Updated] (HDFS-8268) Port conflict log for data node server is not sufficient
[ https://issues.apache.org/jira/browse/HDFS-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Shahid Khan updated HDFS-8268:
---------------------------------------
    Attachment:  (was: HDFS-8268)
[jira] [Updated] (HDFS-8268) Port conflict log for data node server is not sufficient
[ https://issues.apache.org/jira/browse/HDFS-8268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mohammad Shahid Khan updated HDFS-8268:
---------------------------------------
    Attachment: HDFS-8268.patch
[jira] [Updated] (HDFS-7348) Erasure Coding: perform stripping erasure decoding/recovery work given block reader and writer
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yi Liu updated HDFS-7348:
-------------------------
    Attachment: HDFS-7348.001.patch
[jira] [Commented] (HDFS-5574) Remove buffer copy in BlockReader.skip
[ https://issues.apache.org/jira/browse/HDFS-5574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516809#comment-14516809 ]

Hadoop QA commented on HDFS-5574:
---------------------------------
| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 17m 23s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 4 new or modified test files. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac | 9m 15s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 31s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 4m 46s | The applied patch generated 1 additional checkstyle issues. |
| {color:green}+1{color} | install | 1m 42s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 38s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 5m 27s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:red}-1{color} | common tests | 22m 32s | Tests failed in hadoop-common. |
| {color:red}-1{color} | hdfs tests | 174m 54s | Tests failed in hadoop-hdfs. |
| | | 248m 40s | |

|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.ha.TestPipelinesFailover |
| | hadoop.hdfs.TestRemoteBlockReader |
| | hadoop.hdfs.TestDFSInputStream |
| | hadoop.hdfs.server.namenode.TestStartup |
| Timed out tests | org.apache.hadoop.ha.TestZKFailoverControllerStress |

|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728717/HDFS-5574.008.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / feb68cb |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10429/artifact/patchprocess/checkstyle-result-diff.txt |
| hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10429/artifact/patchprocess/testrun_hadoop-common.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10429/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10429/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10429/console |

This message was automatically generated.
[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-7678:
----------------------------
    Target Version/s: HDFS-7285
   Affects Version/s: HDFS-7285
              Status: Patch Available  (was: In Progress)

Erasure coding: DFSInputStream with decode functionality
--------------------------------------------------------
                 Key: HDFS-7678
                 URL: https://issues.apache.org/jira/browse/HDFS-7678
             Project: Hadoop HDFS
          Issue Type: Sub-task
    Affects Versions: HDFS-7285
            Reporter: Li Bo
            Assignee: Zhe Zhang
         Attachments: BlockGroupReader.patch, HDFS-7678.000.patch, HDFS-7678.001.patch

A block group reader will read data from a BlockGroup, whether it is in striping layout or contiguous layout. Corrupt blocks can be known before reading (told by the namenode) or found during reading. The block group reader needs to do decoding work when some blocks are found to be corrupt.
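The decode-on-read behavior described here can be sketched as a fallback path: serve a healthy cell directly, and only reconstruct from the surviving cells when a read fails. XOR stands in for the real decoder, and `CellSource`/`readCell` are illustrative names, not the HDFS-7678 API.

```java
import java.util.Arrays;

// Sketch of a reader that decodes only when a cell read fails, for a stripe
// of k data cells plus one XOR parity cell (index k). Assumes at most one
// missing cell. Illustrative names only, not the HDFS-7678 patch.
public class DecodingReaderSketch {
    public interface CellSource {
        byte[] read(int index);  // returns null if the cell is corrupt/missing
    }

    /** Read cell i of a (k data + 1 parity) stripe, decoding on failure. */
    public static byte[] readCell(CellSource src, int i, int k) {
        byte[] cell = src.read(i);
        if (cell != null) {
            return cell;                      // fast path: healthy block
        }
        byte[] out = null;
        for (int j = 0; j <= k; j++) {        // slow path: XOR surviving cells
            if (j == i) {
                continue;
            }
            byte[] other = src.read(j);
            if (out == null) {
                out = other.clone();
            } else {
                for (int b = 0; b < out.length; b++) {
                    out[b] ^= other[b];
                }
            }
        }
        return out;
    }
}
```

Keeping the decode on the slow path means healthy reads pay no extra cost; only a corrupt or missing block triggers the extra fetches and reconstruction.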
[jira] [Updated] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhe Zhang updated HDFS-7678: Attachment: HDFS-7678-HDFS-7285.002.patch New patch with a functional test. Also renaming to trigger Jenkins. Erasure coding: DFSInputStream with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Li Bo Assignee: Zhe Zhang Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678.000.patch, HDFS-7678.001.patch A block group reader will read data from a BlockGroup whether it is in striping layout or contiguous layout. Corrupt blocks can be known before reading (reported by the namenode) or discovered during reading. The block group reader needs to do decoding work when some blocks are found to be corrupt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8204) Mover/Balancer should not schedule two replicas to the same DN
[ https://issues.apache.org/jira/browse/HDFS-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517930#comment-14517930 ] Tsz Wo Nicholas Sze commented on HDFS-8204: --- Actually, it does support. ... You are right. Thanks for pointing it out. Mover/Balancer should not schedule two replicas to the same DN -- Key: HDFS-8204 URL: https://issues.apache.org/jira/browse/HDFS-8204 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Walter Su Assignee: Walter Su Priority: Minor Attachments: HDFS-8204.001.patch, HDFS-8204.002.patch, HDFS-8204.003.patch The Balancer moves blocks between Datanodes in older versions (< 2.6). The Balancer moves blocks between StorageGroups (introduced by HDFS-6584) in the new version (>= 2.6). The function
{code}
class DBlock extends Locations<StorageGroup>
DBlock.isLocatedOn(StorageGroup loc)
{code}
-is flawed and may cause 2 replicas to end up on the same node after running the balancer.- For example: We have 2 nodes. Each node has two storages. We have (DN0, SSD), (DN0, DISK), (DN1, SSD), (DN1, DISK). We have a block with ONE_SSD storage policy. The block has 2 replicas. They are in (DN0,SSD) and (DN1,DISK). The replica in (DN0,SSD) should not be moved to (DN1,SSD) by the Balancer; otherwise DN1 ends up with 2 replicas. -- UPDATE (Thanks [~szetszwo] for pointing it out): {color:red} This bug will *NOT* cause 2 replicas to end up on the same node after running the balancer, thanks to the Datanode rejecting the transfer. {color} We see a lot of ERROR messages when running the test.
{code}
2015-04-27 10:08:15,809 ERROR datanode.DataNode (DataXceiver.java:run(277)) - host1.foo.com:59537:DataXceiver error processing REPLACE_BLOCK operation src: /127.0.0.1:52532 dst: /127.0.0.1:59537
org.apache.hadoop.hdfs.server.datanode.ReplicaAlreadyExistsException: Block BP-264794661-9.96.1.34-1430100451121:blk_1073741825_1001 already exists in state FINALIZED and thus cannot be created.
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1447)
    at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:186)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.replaceBlock(DataXceiver.java:1158)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReplaceBlock(Receiver.java:229)
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:77)
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:250)
    at java.lang.Thread.run(Thread.java:722)
{code}
The Balancer runs 5~20 iterations in the test before it exits. It's inefficient. The Balancer should not *schedule* such a move in the first place, even though it will fail anyway. In the test, it should exit after 5 iterations. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
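The scheduling rule implied by the example above can be sketched like this; `MoveGuard` and `shouldSchedule` are invented names for illustration, not the actual `Dispatcher` API:

```java
import java.util.Set;

// Sketch of the guard: before scheduling a move to a target storage group,
// check whether the target *datanode* already holds a replica of the block.
// Such a move would only be rejected later by the DataNode with a
// ReplicaAlreadyExistsException, so it should not be scheduled at all.
class MoveGuard {
    // locations: IDs of the datanodes currently holding a replica of the block
    static boolean shouldSchedule(Set<String> locations,
                                  String sourceDn, String targetDn) {
        if (sourceDn.equals(targetDn)) {
            return true; // moving between storages on the same node is fine
        }
        return !locations.contains(targetDn);
    }
}
```

With the ONE_SSD example above, moving the (DN0,SSD) replica to (DN1,SSD) is refused because DN1 already holds the (DN1,DISK) replica, while an intra-node move from (DN0,SSD) to (DN0,DISK) is still allowed.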
[jira] [Updated] (HDFS-8204) Mover/Balancer should not schedule two replicas to the same DN
[ https://issues.apache.org/jira/browse/HDFS-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8204: -- Resolution: Fixed Fix Version/s: 2.7.1 Status: Resolved (was: Patch Available) I have committed this. Thanks, Walter! Mover/Balancer should not schedule two replicas to the same DN -- Key: HDFS-8204 URL: https://issues.apache.org/jira/browse/HDFS-8204 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Walter Su Assignee: Walter Su Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8204.001.patch, HDFS-8204.002.patch, HDFS-8204.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8204) Mover/Balancer should not schedule two replicas to the same DN
[ https://issues.apache.org/jira/browse/HDFS-8204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517943#comment-14517943 ] Hudson commented on HDFS-8204: -- SUCCESS: Integrated in Hadoop-trunk-Commit #7694 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7694/]) HDFS-8204. Mover/Balancer should not schedule two replicas to the same datanode. Contributed by Walter Su (szetszwo: rev 5639bf02da716b3ecda785979b3d08cdca15972d) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/balancer/TestBalancer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Mover/Balancer should not schedule two replicas to the same DN -- Key: HDFS-8204 URL: https://issues.apache.org/jira/browse/HDFS-8204 Project: Hadoop HDFS Issue Type: Improvement Components: balancer mover Reporter: Walter Su Assignee: Walter Su Priority: Minor Fix For: 2.7.1 Attachments: HDFS-8204.001.patch, HDFS-8204.002.patch, HDFS-8204.003.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518362#comment-14518362 ] Zhe Zhang commented on HDFS-7348: - Please find detailed comments below: Logic: # Since recovering multiple missing blocks at once is a pretty rare case, should we just reconstruct all missing blocks and use {{DataNode#DataTransfer}} to push them out? # I filed HDFS-8282 to move {{StripedReadResult}} and {{waitNextCompletion}} to {{StripedBlockUtil}}. # In foreground recovery we read in parallel to minimize latency. It's an interesting design question whether we should do the same in background recovery. More discussion is needed here. # If we do choose to read source blocks in parallel, how should we design the unit of sync-and-decode? Right now the readers read a cell at a time. Another option is to read entire blocks and then decode. The drawback is larger temporary memory usage. The benefits are: i) simpler logic (no need to recreate reading threads) and avoiding the overhead of initializing connections to source DNs; ii) connections stay open for as short a time as possible (fast readers don't need to wait for slow ones); iii) does it save CPU to decode in big chunks? [~drankye] Could you advise? # Should we save a copy of the reconstructed block locally? More space will be used, but it avoids re-decoding if the push fails. Nits: # Could use {{ArrayList}} {code} stripedReaders = new ArrayList<StripedReader>(sources.length); {code} # Maybe we can move {{getBlock}} to {{StripedBlockUtil}} too; it's a useful util to only parse the {{Block}}. If it sounds good to you I'll move it in HDFS-8282. 
Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missed striped block in the striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
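As a toy illustration of the cell-at-a-time sync-and-decode unit discussed in the comments above, the sketch below reconstructs one missing block with plain XOR parity standing in for a real erasure codec; `CellRecovery` and its signature are invented for the example, and real EC uses Reed-Solomon rather than XOR:

```java
// Toy decoder: parity is the XOR of all source blocks, so a single missing
// block is recovered by XOR-ing the parity with the surviving sources, one
// cell (fixed-size chunk) at a time.
class CellRecovery {
    // sources[i] == null marks the missing block
    static byte[] reconstruct(byte[][] sources, byte[] parity, int cellSize) {
        byte[] out = new byte[parity.length];
        for (int off = 0; off < parity.length; off += cellSize) {
            int end = Math.min(off + cellSize, parity.length);
            // decode one cell: in the real flow this is where reader threads
            // synchronize before handing their data to the decoder
            for (int j = off; j < end; j++) {
                byte b = parity[j];
                for (byte[] src : sources) {
                    if (src != null) b ^= src[j];
                }
                out[j] = b;
            }
        }
        return out;
    }
}
```

Reading entire blocks before decoding, as debated above, would amount to widening `cellSize` to the block length, at the cost of buffering everything in memory.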
[jira] [Updated] (HDFS-8273) FSNamesystem#Delete() should not call logSync() when holding the lock
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8273: - Resolution: Fixed Fix Version/s: 2.7.1 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks Jing for the reviews. FSNamesystem#Delete() should not call logSync() when holding the lock - Key: HDFS-8273 URL: https://issues.apache.org/jira/browse/HDFS-8273 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Haohui Mai Priority: Blocker Fix For: 2.7.1 Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8284) Add usage of tracing originated in DFSClient to doc
Masatake Iwasaki created HDFS-8284: -- Summary: Add usage of tracing originated in DFSClient to doc Key: HDFS-8284 URL: https://issues.apache.org/jira/browse/HDFS-8284 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Reporter: Masatake Iwasaki Assignee: Masatake Iwasaki Tracing originated in DFSClient uses configuration keys prefixed with dfs.client.htrace after HDFS-8213. Server side tracing uses conf keys prefixed with dfs.htrace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
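Assuming the documentation follows the usual hdfs-site.xml conventions, the client/server split described above might look like the fragment below. Only the `dfs.htrace` and `dfs.client.htrace` prefixes come from the issue; the `spanreceiver.classes` suffix and the receiver class are assumed key/value names for illustration.

```xml
<!-- Server-side tracing (NameNode/DataNode): dfs.htrace.* -->
<property>
  <name>dfs.htrace.spanreceiver.classes</name> <!-- assumed key name -->
  <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
</property>

<!-- Client-side tracing (DFSClient), after HDFS-8213: dfs.client.htrace.* -->
<property>
  <name>dfs.client.htrace.spanreceiver.classes</name> <!-- assumed key name -->
  <value>org.apache.htrace.impl.LocalFileSpanReceiver</value>
</property>
```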
[jira] [Commented] (HDFS-7687) Change fsck to support EC files
[ https://issues.apache.org/jira/browse/HDFS-7687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518265#comment-14518265 ] Tsz Wo Nicholas Sze commented on HDFS-7687: --- Both JIRAs are merged to the branch now. Change fsck to support EC files --- Key: HDFS-7687 URL: https://issues.apache.org/jira/browse/HDFS-7687 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Tsz Wo Nicholas Sze Assignee: Takanobu Asanuma We need to change fsck so that it can detect under replicated and corrupted EC files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518394#comment-14518394 ] Colin Patrick McCabe commented on HDFS-7758: Can you rebase the patch on trunk? Thanks. Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch HDFS-7496 introduced reference counting of the volume instances in use, to prevent race conditions when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance without increasing its reference count. In this JIRA, we retire {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and related methods to access {{FsVolume}}, making sure that a consumer of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8283: -- Attachment: h8283_20150428.patch h8283_20150428.patch: 1st patch. DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8283_20150428.patch - When throwing an exception: -* always set lastException -* always create a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
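The second bullet above ("always create a new exception so that it has the new stack trace") can be illustrated with a minimal sketch; `Rethrow` and `freshCopy` are invented names, not DataStreamer code:

```java
import java.io.IOException;

// Rethrowing a stored exception object reuses the stack trace captured where
// it was originally constructed. Wrapping it in a *new* exception records the
// current call site while keeping the original as the cause.
class Rethrow {
    static IOException freshCopy(IOException last) {
        return new IOException(last.getMessage(), last);
    }
}
```

The caller would throw `freshCopy(lastException)` instead of `lastException` itself, so each throw site is visible in the resulting trace.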
[jira] [Created] (HDFS-8283) DataStreamer cleanup and some minor improvement
Tsz Wo Nicholas Sze created HDFS-8283: - Summary: DataStreamer cleanup and some minor improvement Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor - When throwing an exception: -* always set lastException -* always create a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8273) logSync() is called inside of write lock for delete op
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518492#comment-14518492 ] Jing Zhao commented on HDFS-8273: - +1 for the latest patch. Thanks for the fix, Haohui! logSync() is called inside of write lock for delete op -- Key: HDFS-8273 URL: https://issues.apache.org/jira/browse/HDFS-8273 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Walter Su updated HDFS-7980: Attachment: HDFS-7980.004.patch Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch In the current implementation the datanode calls the reportReceivedDeletedBlocks() method, which is an incremental block report, before calling the bpNamenode.blockReport() method. So in a large (several thousands of datanodes) and busy cluster it will slow down the startup of the namenode by more than an hour.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }
  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();

  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...
  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount < dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7770) Need document for storage type label of data node storage locations under dfs.data.dir
[ https://issues.apache.org/jira/browse/HDFS-7770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518454#comment-14518454 ] Hadoop QA commented on HDFS-7770: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 17m 32s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 29s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 32s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | site | 2m 54s | Site still builds. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 12s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 163m 21s | Tests failed in hadoop-hdfs. 
| | | | 206m 31s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728926/HDFS-7770.02.patch | | Optional Tests | javadoc javac unit site | | git revision | trunk / 5190923 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10437/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10437/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10437/console | This message was automatically generated. Need document for storage type label of data node storage locations under dfs.data.dir -- Key: HDFS-7770 URL: https://issues.apache.org/jira/browse/HDFS-7770 Project: Hadoop HDFS Issue Type: Improvement Components: documentation Affects Versions: 2.6.0 Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao Attachments: HDFS-7700.01.patch, HDFS-7770.00.patch, HDFS-7770.02.patch HDFS-2832 enables support for heterogeneous storages in HDFS, which allows a DN to be a collection of storages with different types. However, I can't find documentation on how to label different storage types in the following two documents; I found the information in the design spec. It would be good to document this for admins and users of the related archival storage and storage policy features. http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/ArchivalStorage.html http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml This JIRA is opened to add documentation for the new storage type labels. 1. 
Add an example under the ArchivalStorage.html#Configuration section:
{code}
<property>
  <name>dfs.data.dir</name>
  <value>[DISK]file:///hddata/dn/disk0,[SSD]file:///hddata/dn/ssd0,[ARCHIVE]file:///hddata/dn/archive0</value>
</property>
{code}
2. Add a short description of the [DISK/SSD/ARCHIVE/RAM_DISK] options in hdfs-default.xml#dfs.data.dir, and document DISK as the default storage type when no storage type is labeled in the data node storage location configuration. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze updated HDFS-8283: -- Status: Patch Available (was: Open) DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8273) logSync() is called inside of write lock for delete op
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518468#comment-14518468 ] Hadoop QA commented on HDFS-8273: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 34s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 35s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 7m 13s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 3s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 164m 40s | Tests passed in hadoop-hdfs. 
| | | | 212m 19s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728929/HDFS-8273.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5190923 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10436/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10436/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf900.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10436/console | This message was automatically generated. logSync() is called inside of write lock for delete op -- Key: HDFS-8273 URL: https://issues.apache.org/jira/browse/HDFS-8273 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8272) Erasure Coding: simplify the retry logic in DFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jing Zhao updated HDFS-8272: Attachment: h8272-HDFS-7285.001.patch Thanks again for the review, Zhe! Update the patch to address your comments (including DFSInputStream changes). The main change is to only refetch the key/token once for the group. About the encryption key retry logic, I think it is handled while creating the block reader. More specifically, while creating the TCP peer in {{BlockReaderFactory#getRemoteBlockReaderFromTcp}}, the sasl protocol is triggered during which the encryptionKey can be refetched. Erasure Coding: simplify the retry logic in DFSStripedInputStream - Key: HDFS-8272 URL: https://issues.apache.org/jira/browse/HDFS-8272 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: h8272-HDFS-7285.000.patch, h8272-HDFS-7285.001.patch Currently in DFSStripedInputStream the retry logic is still the same with DFSInputStream. More specifically, every failed read will try to search for another source node. And an exception is thrown when no new source node can be identified. This logic is not appropriate for EC inputstream and can be simplified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518376#comment-14518376 ] Yongjun Zhang commented on HDFS-7281: - Hi [~mingma], I think we can get this fix into trunk targeting 3.0, and follow up with other improvements like [~andrew.wang] proposed in the email thread. Would you please take a look at the comment at https://issues.apache.org/jira/browse/HDFS-7281?focusedCommentId=14510451&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14510451 ? Thanks. Missing block is marked as corrupted block -- Key: HDFS-7281 URL: https://issues.apache.org/jira/browse/HDFS-7281 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Labels: supportability Attachments: HDFS-7281-2.patch, HDFS-7281-3.patch, HDFS-7281-4.patch, HDFS-7281.patch In the situation where a block has lost all its replicas, fsck shows the block as missing as well as corrupted. Perhaps it is better not to mark the block corrupted in this case. The reason it is marked as corrupted is that numCorruptNodes == numNodes == 0 in the following code.
{noformat}
// BlockManager
final boolean isCorrupt = numCorruptNodes == numNodes;
{noformat}
Would like to clarify whether it is intentional to mark a missing block as corrupted, or whether it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
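The fix direction suggested above, not reporting a block as corrupt when it simply has no replicas, amounts to a one-line guard. This is an illustrative sketch, not the actual BlockManager patch:

```java
// A block with zero replicas is "missing", not "corrupt": only call it
// corrupt when replicas exist and every one of them is corrupt.
class CorruptCheck {
    static boolean isCorrupt(int numCorruptNodes, int numNodes) {
        return numNodes > 0 && numCorruptNodes == numNodes;
    }
}
```

With this guard, the `numCorruptNodes == numNodes == 0` case from the description is classified as missing only.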
[jira] [Updated] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-7758: Attachment: HDFS-7758.006.patch Thanks for looking into this, [~cmccabe]. Uploaded a rebased patch. Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch HDFS-7496 introduced reference counting of the volume instances in use, to prevent race conditions when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak a volume instance without increasing its reference count. In this JIRA, we retire {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and related methods to access {{FsVolume}}, making sure that a consumer of {{FsVolume}} always holds a correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
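The reference-counted access pattern this JIRA moves toward can be sketched as below; `Volume`, `VolumeRef`, and `obtainReference` are illustrative names modeled on the description, not the actual `FsDatasetSpi` API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Callers never touch the raw volume list; they obtain a closeable reference,
// so every use is bracketed by an increment/decrement of the count, and a
// hot-swap can safely wait for the count to drain to zero.
class Volume {
    final AtomicInteger refCount = new AtomicInteger();

    VolumeRef obtainReference() {
        refCount.incrementAndGet();
        return new VolumeRef(this);
    }
}

class VolumeRef implements AutoCloseable {
    private final Volume volume;
    VolumeRef(Volume volume) { this.volume = volume; }
    @Override public void close() { volume.refCount.decrementAndGet(); }
}
```

Used as `try (VolumeRef ref = volume.obtainReference()) { ... }`, the count is released even when an exception is thrown, which is exactly the leak that handing out raw volumes via `getVolumes()` allowed.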
[jira] [Commented] (HDFS-7397) The conf key dfs.client.read.shortcircuit.streams.cache.size is misleading
[ https://issues.apache.org/jira/browse/HDFS-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518487#comment-14518487 ] Hadoop QA commented on HDFS-7397: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 23s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 56s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | native | 3m 19s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 164m 55s | Tests failed in hadoop-hdfs. 
| | | | 203m 59s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.web.TestWebHdfsFileSystemContract | | Timed out tests | org.apache.hadoop.hdfs.server.mover.TestMover | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728676/HDFS-7397-002.patch | | Optional Tests | javadoc javac unit | | git revision | trunk / 5190923 | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10438/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10438/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10438/console | This message was automatically generated. The conf key dfs.client.read.shortcircuit.streams.cache.size is misleading Key: HDFS-7397 URL: https://issues.apache.org/jira/browse/HDFS-7397 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Minor Attachments: HDFS-7397-002.patch, HDFS-7397.patch For dfs.client.read.shortcircuit.streams.cache.size, is it in MB or KB? Interestingly, it is neither in MB nor KB. It is the number of shortcircuit streams. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
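A minimal sketch of the point under discussion, using a plain map instead of Hadoop's Configuration class (the 256 default matches DFSConfigKeys at the time, but treat it as an assumption): the value of this key is a count of cached short-circuit stream pairs, not a size in megabytes or kilobytes.

```java
import java.util.Map;

public class ShortCircuitCacheConfSketch {
    // The key reads like a byte size because it ends in ".size", but the
    // value is the number of cached short-circuit stream pairs.
    static int cachedStreams(Map<String, String> conf) {
        return Integer.parseInt(conf.getOrDefault(
            "dfs.client.read.shortcircuit.streams.cache.size", "256"));
    }

    public static void main(String[] args) {
        System.out.println(cachedStreams(Map.of())); // a count, not bytes
    }
}
```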
[jira] [Work started] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
[ https://issues.apache.org/jira/browse/HDFS-8282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8282 started by Zhe Zhang. --- Erasure coding: move striped reading logic to StripedBlockUtil -- Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8282) Erasure coding: move striped reading logic to StripedBlockUtil
Zhe Zhang created HDFS-8282: --- Summary: Erasure coding: move striped reading logic to StripedBlockUtil Key: HDFS-8282 URL: https://issues.apache.org/jira/browse/HDFS-8282 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Zhe Zhang -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518415#comment-14518415 ] Hadoop QA commented on HDFS-7678: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 30s | Pre-patch HDFS-7285 compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. | | {color:red}-1{color} | checkstyle | 5m 31s | The applied patch generated 3 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 31s | The patch built with eclipse:eclipse. | | {color:red}-1{color} | findbugs | 3m 55s | The patch appears to introduce 9 new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 196m 58s | Tests failed in hadoop-hdfs. | | {color:red}-1{color} | hdfs tests | 0m 20s | Tests failed in hadoop-hdfs-client. 
| | | | 243m 54s | | \\ \\ || Reason || Tests || | FindBugs | module:hadoop-hdfs-client | | | org.apache.hadoop.hdfs.protocol.LocatedStripedBlock.getBlockIndices() may expose internal representation by returning LocatedStripedBlock.blockIndices At LocatedStripedBlock.java:by returning LocatedStripedBlock.blockIndices At LocatedStripedBlock.java:[line 57] | | FindBugs | module:hadoop-hdfs | | | Inconsistent synchronization of org.apache.hadoop.hdfs.DFSOutputStream.streamer; locked 89% of time Unsynchronized access at DFSOutputStream.java:89% of time Unsynchronized access at DFSOutputStream.java:[line 142] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int) At DFSStripedInputStream.java:to long in org.apache.hadoop.hdfs.DFSStripedInputStream.planReadPortions(int, int, long, int, int) At DFSStripedInputStream.java:[line 101] | | | Dead store to offSuccess in org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:org.apache.hadoop.hdfs.StripedDataStreamer.endBlock() At StripedDataStreamer.java:[line 104] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:to long in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStriped.spaceConsumed() At BlockInfoStriped.java:[line 208] | | | Possible null pointer dereference of arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:arr$ in org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoStripedUnderConstruction.initializeBlockRecovery(long) Dereferenced at BlockInfoStripedUnderConstruction.java:[line 206] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema):in 
org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.createErasureCodingZone(String, ECSchema): String.getBytes() At ErasureCodingZoneManager.java:[line 116] | | | Found reliance on default encoding in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath):in org.apache.hadoop.hdfs.server.namenode.ErasureCodingZoneManager.getECZoneInfo(INodesInPath): new String(byte[]) At ErasureCodingZoneManager.java:[line 81] | | | Result of integer multiplication cast to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:to long in org.apache.hadoop.hdfs.util.StripedBlockUtil.constructInternalBlock(LocatedStripedBlock, int, int, int, int) At StripedBlockUtil.java:[line 75] | | Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate | | | hadoop.hdfs.TestDFSStripedInputStream | | | hadoop.hdfs.TestReadStripedFile | | Timed out tests | org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader | | | org.apache.hadoop.hdfs.server.namenode.TestAddStripedBlocks | | Failed build | hadoop-hdfs-client | \\ \\ || Subsystem || Report/Notes || | Patch URL |
[jira] [Commented] (HDFS-8273) logSync() is called inside of write lock for delete op
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518444#comment-14518444 ] Hadoop QA commented on HDFS-8273: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 29s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 33s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:green}+1{color} | checkstyle | 7m 46s | There were no new checkstyle issues. | | {color:green}+1{color} | install | 1m 32s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 14s | Pre-build of native portion | | {color:red}-1{color} | hdfs tests | 163m 40s | Tests failed in hadoop-hdfs. 
| | | | 211m 44s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728919/HDFS-8273.001.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / bc1bd7e | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10435/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10435/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10435/console | This message was automatically generated. logSync() is called inside of write lock for delete op -- Key: HDFS-8273 URL: https://issues.apache.org/jira/browse/HDFS-8273 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
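The ordering being fixed can be sketched as follows; the lock, log, and method names are illustrative, not the FSNamesystem internals. The edit is appended while holding the write lock, but logSync() — which blocks on disk I/O — runs only after the lock is released, so other namespace operations are not stalled behind the flush.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LogSyncOrderingSketch {
    final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    final StringBuilder editLog = new StringBuilder();

    void delete(String path) {
        fsLock.writeLock().lock();
        try {
            // Mutate the namespace and append the edit in memory:
            editLog.append("OP_DELETE ").append(path).append('\n');
        } finally {
            fsLock.writeLock().unlock();
        }
        // Durability work happens outside the lock so concurrent
        // operations are not blocked behind disk I/O.
        logSync();
    }

    void logSync() { /* flush editLog to stable storage */ }

    public static void main(String[] args) {
        LogSyncOrderingSketch fs = new LogSyncOrderingSketch();
        fs.delete("/tmp/f");
        System.out.println(fs.editLog.toString().trim());
    }
}
```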
[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518465#comment-14518465 ] Zhe Zhang commented on HDFS-7678: - Thanks Andrew for the review; it's very helpful. Some quick feedback while I work on the harder parts: bq. waitNextCompletion, shouldn't the read timeout be an overall timeout Great idea. Otherwise the timeout policy is too strict in the beginning and too loose toward the end. bq. throwing InterruptedException on empty futures This part (also the main {{waitNextCompletion}} logic) was actually inherited from {{DFSInputStream#getFirstToComplete}}. I think we should take care of this issue together with other {{InterruptedException}} updates in the planned follow-on JIRA (against trunk). I will update {{waitNextCompletion}} to get rid of this {{InterruptedException}} under this JIRA. bq. Do we actually need missingBlkIndices or the non-success cases? It's the set complement of fetchedBlkIndices. Not really. {{missingBlkIndices}} has all _confirmed_ missing blocks while {{fetchedBlkIndices}} has all fetched blocks that _cover the max missing span_. For example, if the cell size is 4k, you want to read range 2k~8k, and block #1 is missing, then {{missingBlkIndices}} should contain only _1_ and {{fetchedBlkIndices}} is empty, since block #0 needs to be refetched (we only have half of it for recovery). bq. We always go through a function called fetchExtraBlks... Good catch, and I think that's what fails {{TestReadStripedFile}} and {{TestDFSStripedInputStream}}. bq. I wonder if it'd be better to do all the fetching first (including parity if necessary), It's an appealing idea. [~hitliuyi] has an interesting logic of inserting a new Future when finding a failed Future under HDFS-7348. I'll try to leverage that. 
Erasure coding: DFSInputStream with decode functionality Key: HDFS-7678 URL: https://issues.apache.org/jira/browse/HDFS-7678 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7285 Reporter: Li Bo Assignee: Zhe Zhang Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678.000.patch, HDFS-7678.001.patch A block group reader will read data from BlockGroup no matter in striping layout or contiguous layout. The corrupt blocks can be known before reading(told by namenode), or just be found during reading. The block group reader needs to do decoding work when some blocks are found corrupt. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
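Zhe's 2k~8k example can be checked with a small sketch (the helper below is hypothetical; the real mapping lives in StripedBlockUtil): with 4 KB cells striped round-robin over the data blocks, the range touches cells 0 and 1 and therefore blocks #0 and #1 of the group — which is why losing block #1 forces block #0 to be fetched in full for recovery even though only half of it was in the requested range.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class StripedRangeSketch {
    // In a striped layout each cellSize-byte cell i lives on data block
    // (i % dataBlocks). Returns the block indices a byte range touches.
    static Set<Integer> blocksForRange(long start, long endExclusive,
                                       int cellSize, int dataBlocks) {
        Set<Integer> blocks = new LinkedHashSet<>();
        for (long cell = start / cellSize;
             cell <= (endExclusive - 1) / cellSize; cell++) {
            blocks.add((int) (cell % dataBlocks));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // 4 KB cells, 6 data blocks, read range 2k..8k from the comment:
        System.out.println(blocksForRange(2048, 8192, 4096, 6)); // [0, 1]
    }
}
```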
[jira] [Commented] (HDFS-7995) Implement chmod in the HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-7995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518482#comment-14518482 ] Haohui Mai commented on HDFS-7995: -- Sorry for the late reply. Thanks for the work. {code} + <div class="modal" id="perm-info" tabindex="-1" role="dialog" aria-hidden="true"> {code} It makes sense to give the id a prefix (e.g. {{explorer}}) to avoid confusion. The same comment applies to things like {{perm-heading}}, etc. {code} +<td><span class="explorer-perm-links editable-click"> + {type|helper_to_directory}{permission|helper_to_permission} + {aclBit|helper_to_acl_bit} +</span></td> -<td><a style="cursor:pointer" inode-type="{type}" class="explorer-browse-links" inode-path="{pathSuffix}">{pathSuffix}</a></td> +<td><a style="cursor:pointer" inode-type="{type}" class="explorer-browse-links">{pathSuffix}</a></td> {code} The change seems unnecessary. {code} + function view_perm_details(filename, abs_path, perms) { {code} There is no need to parse the permission from the string as the original data is available in the {{LISTSTATUS}} call. What you can do is to expose it through a data field, e.g., {code} + <tr inode-path="{pathSuffix}" data-permission="{permission}"> {code} {code} + function convertCheckboxesToOctalPermissions() { {code} It is easier to calculate the permission by exposing the location of the bit using an attribute, e.g., {code} var p = 0; $.each('perm inputbox:checked', function() { p += 1 << (+$(this).attr('data-bit')); }); return p.toString(8); {code} Implement chmod in the HDFS Web UI -- Key: HDFS-7995 URL: https://issues.apache.org/jira/browse/HDFS-7995 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7995.01.patch, HDFS-7995.02.patch We should let users change the permissions of files and directories using the HDFS Web UI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
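Haohui's accumulation — adding 1 shifted left by each checked bit's position, then printing in base 8 — can be sanity-checked with a standalone sketch. The bit-position convention below (8 for user read down to 0 for other execute) is an assumption matching POSIX mode bits, not something specified in the patch.

```java
public class OctalPermSketch {
    // Each checked permission contributes 1 << bitPosition. Bit 8 is the
    // user "r" bit and bit 0 is the other "x" bit, so rw-r--r-- = 644.
    static String toOctal(int... checkedBits) {
        int p = 0;
        for (int bit : checkedBits) {
            p += 1 << bit;
        }
        return Integer.toString(p, 8);
    }

    public static void main(String[] args) {
        // user rw (bits 8, 7), group r (bit 5), other r (bit 2):
        System.out.println(toOctal(8, 7, 5, 2)); // 644
    }
}
```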
[jira] [Commented] (HDFS-8056) Decommissioned dead nodes should continue to be counted as dead after NN restart
[ https://issues.apache.org/jira/browse/HDFS-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518357#comment-14518357 ] Ming Ma commented on HDFS-8056: --- [~andrew.wang] and others, appreciate any input you might have. Decommissioned dead nodes should continue to be counted as dead after NN restart Key: HDFS-8056 URL: https://issues.apache.org/jira/browse/HDFS-8056 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ming Ma Assignee: Ming Ma Attachments: HDFS-8056-2.patch, HDFS-8056.patch We had some offline discussion with [~andrew.wang] and [~cmccabe] about this. Bringing this up for more input before getting the patch in place. Dead nodes are tracked by {{DatanodeManager}}'s {{datanodeMap}}. However, after the NN restarts, nodes that were dead before the restart won't be in {{datanodeMap}}. {{DatanodeManager}}'s {{getDatanodeListForReport}} will add those dead nodes, but not if they are in the exclude file. {noformat} if (listDeadNodes) { for (InetSocketAddress addr : includedNodes) { if (foundNodes.matchedBy(addr) || excludedNodes.match(addr)) { continue; } // The remaining nodes are ones that are referenced by the hosts // files but that we do not know about, ie that we have never // heard from. Eg. an entry that is no longer part of the cluster // or a bogus entry was given in the hosts files // // If the host file entry specified the xferPort, we use that. // Otherwise, we guess that it is the default xfer port. // We can't ask the DataNode what it had configured, because it's // dead. DatanodeDescriptor dn = new DatanodeDescriptor(new DatanodeID(addr.getAddress().getHostAddress(), addr.getHostName(), "", addr.getPort() == 0 ? defaultXferPort : addr.getPort(), defaultInfoPort, defaultInfoSecurePort, defaultIpcPort)); setDatanodeDead(dn); nodes.add(dn); } } {noformat} The issue here is that the decommissioned dead node JMX output will differ after NN restart. It might be better to make it consistent across NN restarts. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7995) Implement chmod in the HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-7995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518370#comment-14518370 ] Allen Wittenauer commented on HDFS-7995: +1 lgtm Implement chmod in the HDFS Web UI -- Key: HDFS-7995 URL: https://issues.apache.org/jira/browse/HDFS-7995 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7995.01.patch, HDFS-7995.02.patch We should let users change the permissions of files and directories using the HDFS Web UI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
[ https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-8213: --- Attachment: HDFS-8213.002.patch DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace - Key: HDFS-8213 URL: https://issues.apache.org/jira/browse/HDFS-8213 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Billie Rinaldi Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch DFSClient initializing SpanReceivers is a problem for Accumulo, which manages SpanReceivers through its own configuration. This results in the same receivers being registered multiple times and spans being delivered more than once. The documentation says SpanReceiverHost.getInstance should be issued once per process, so there is no expectation that DFSClient should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
[ https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518389#comment-14518389 ] Colin Patrick McCabe commented on HDFS-8213: Thanks for the review, [~iwasakims]. I attached a patch. Let's do the hdfs-default.xml and other docs stuff later since it's not directly related to this DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace - Key: HDFS-8213 URL: https://issues.apache.org/jira/browse/HDFS-8213 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Billie Rinaldi Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch DFSClient initializing SpanReceivers is a problem for Accumulo, which manages SpanReceivers through its own configuration. This results in the same receivers being registered multiple times and spans being delivered more than once. The documentation says SpanReceiverHost.getInstance should be issued once per process, so there is no expectation that DFSClient should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7995) Implement chmod in the HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-7995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518425#comment-14518425 ] Haohui Mai commented on HDFS-7995: -- I think the code requires some more clean up on unused ids in the HTML. Maybe we should replace the {{closest()}} call with something more performant. Implement chmod in the HDFS Web UI -- Key: HDFS-7995 URL: https://issues.apache.org/jira/browse/HDFS-7995 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Ravi Prakash Assignee: Ravi Prakash Attachments: HDFS-7995.01.patch, HDFS-7995.02.patch We should let users change the permissions of files and directories using the HDFS Web UI -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8269: - Attachment: HDFS-8269.003.patch getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime - Key: HDFS-8269 URL: https://issues.apache.org/jira/browse/HDFS-8269 Project: Hadoop HDFS Issue Type: Bug Reporter: Yesha Vora Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch, HDFS-8269.003.patch When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries: {noformat} <RECORD> <OPCODE>OP_TIMES</OPCODE> <DATA> <TXID>5085</TXID> <LENGTH>0</LENGTH> <PATH>/.reserved/.inodes/18230</PATH> <MTIME>-1</MTIME> <ATIME>1429908236392</ATIME> </DATA> </RECORD> {noformat} Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, so it eventually leads to an NPE when loading the edit logs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8280) Code Cleanup in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518498#comment-14518498 ] Hadoop QA commented on HDFS-8280: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 15m 7s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 38s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 44s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 26s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 4m 1s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 34s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 11s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 18s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 165m 44s | Tests passed in hadoop-hdfs. 
| | | | 211m 23s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728938/HDFS-8280.000.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5190923 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10439/artifact/patchprocess/checkstyle-result-diff.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10439/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10439/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10439/console | This message was automatically generated. Code Cleanup in DFSInputStream -- Key: HDFS-8280 URL: https://issues.apache.org/jira/browse/HDFS-8280 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Attachments: HDFS-8280.000.patch This is some code cleanup separate from HDFS-8272: # Avoid duplicated block reader creation code # If no new source DN can be found, {{getBestNodeDNAddrPair}} returns null instead of throwing Exception. Whether to throw Exception or not should be determined by {{getBestNodeDNAddrPair}}'s caller. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7397) The conf key dfs.client.read.shortcircuit.streams.cache.size is misleading
[ https://issues.apache.org/jira/browse/HDFS-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518554#comment-14518554 ] Brahma Reddy Battula commented on HDFS-7397: Test case failures are unrelated to this patch. The conf key dfs.client.read.shortcircuit.streams.cache.size is misleading Key: HDFS-7397 URL: https://issues.apache.org/jira/browse/HDFS-7397 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Brahma Reddy Battula Priority: Minor Attachments: HDFS-7397-002.patch, HDFS-7397.patch For dfs.client.read.shortcircuit.streams.cache.size, is it in MB or KB? Interestingly, it is neither in MB nor KB. It is the number of shortcircuit streams. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518572#comment-14518572 ] Neeta Garimella commented on HDFS-3107: --- Could someone comment on why truncate is not exposed via the FileSystem class? Are we expecting applications that use this to call the DFSClient interface directly? HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Lei Chang Assignee: Plamen Jeliazkov Fix For: 2.7.0 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard POSIX operation), which is the reverse of append; this makes upper-layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
[ https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518570#comment-14518570 ] Hadoop QA commented on HDFS-8213: - (!) The patch artifact directory on has been removed! This is a fatal error for test-patch.sh. Aborting. Jenkins (node H4) information at https://builds.apache.org/job/PreCommit-HDFS-Build/10444/ may provide some hints. DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace - Key: HDFS-8213 URL: https://issues.apache.org/jira/browse/HDFS-8213 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.7.0 Reporter: Billie Rinaldi Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-8213.001.patch, HDFS-8213.002.patch DFSClient initializing SpanReceivers is a problem for Accumulo, which manages SpanReceivers through its own configuration. This results in the same receivers being registered multiple times and spans being delivered more than once. The documentation says SpanReceiverHost.getInstance should be issued once per process, so there is no expectation that DFSClient should do this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8273) FSNamesystem#Delete() should not call logSync() when holding the lock
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518503#comment-14518503 ]
Hadoop QA commented on HDFS-8273:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 15m 6s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:red}-1{color} | tests included | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:red}-1{color} | javac | 7m 43s | The applied patch generated 2 additional warning messages. |
| {color:green}+1{color} | javadoc | 9m 50s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 24s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 5m 28s | There were no new checkstyle issues. |
| {color:green}+1{color} | install | 1m 37s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 9s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 18s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 164m 23s | Tests failed in hadoop-hdfs. |
| | | | 211m 37s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.server.namenode.TestFileTruncate |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728929/HDFS-8273.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 5190923 |
| javac | https://builds.apache.org/job/PreCommit-HDFS-Build/10440/artifact/patchprocess/diffJavacWarnings.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10440/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10440/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10440/console |
This message was automatically generated.

FSNamesystem#Delete() should not call logSync() when holding the lock
---------------------------------------------------------------------
Key: HDFS-8273
URL: https://issues.apache.org/jira/browse/HDFS-8273
Project: Hadoop HDFS
Issue Type: Bug
Components: namenode
Affects Versions: 2.7.0
Reporter: Jing Zhao
Assignee: Haohui Mai
Priority: Blocker
Fix For: 2.7.1
Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch

HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
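The locking pattern at issue can be sketched as follows. The names (writeLock, journal, logSync) are stand-ins, not the actual FSNamesystem fields: the point is that the durable sync is a slow blocking operation, so it must run after the namespace write lock is released, or every other namespace operation stalls behind it.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch: mutate the namespace and buffer the edit under the lock, but do
// the expensive durable sync only after releasing it.
public class DeleteOpSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final StringBuilder journal = new StringBuilder(); // stand-in edit log

    public boolean delete(String path) {
        boolean removed;
        lock.writeLock().lock();
        try {
            removed = true;                                        // mutate in-memory namespace
            journal.append("OP_DELETE ").append(path).append('\n'); // buffer the edit
        } finally {
            lock.writeLock().unlock();   // release BEFORE the expensive sync
        }
        logSync();                       // durable sync happens outside the lock
        return removed;
    }

    private void logSync() {
        // In HDFS this flushes buffered edits to disk / journal nodes.
    }
}
```

Calling logSync() inside the try block above is exactly the accidental regression described: correctness-wise equivalent, but it serializes all namespace operations behind each disk sync.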
[jira] [Updated] (HDFS-8280) Code Cleanup in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8280: - Resolution: Fixed Fix Version/s: 2.8.0 Status: Resolved (was: Patch Available) I've committed the patch to trunk and branch-2. Thanks [~jingzhao] for the contribution. Code Cleanup in DFSInputStream -- Key: HDFS-8280 URL: https://issues.apache.org/jira/browse/HDFS-8280 Project: Hadoop HDFS Issue Type: Improvement Reporter: Jing Zhao Assignee: Jing Zhao Priority: Minor Fix For: 2.8.0 Attachments: HDFS-8280.000.patch This is some code cleanup separate from HDFS-8272: # Avoid duplicated block reader creation code # If no new source DN can be found, {{getBestNodeDNAddrPair}} returns null instead of throwing Exception. Whether to throw Exception or not should be determined by {{getBestNodeDNAddrPair}}'s caller. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
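The second cleanup point, returning null and letting the caller decide whether the condition is fatal, can be sketched like this. The method names (bestNode, chooseDataNode) are illustrative, not the patch's actual signatures:

```java
import java.io.IOException;
import java.util.List;
import java.util.Set;

// Sketch: the lookup helper reports "no candidate" as null; each caller
// decides whether that is an error worth throwing for.
public class NodeChoiceSketch {
    /** Returns the first node not in the ignored set, or null if none is left. */
    static String bestNode(List<String> nodes, Set<String> ignored) {
        for (String n : nodes) {
            if (!ignored.contains(n)) {
                return n;
            }
        }
        return null; // no candidate: the caller, not this helper, decides to throw
    }

    /** A caller for which "no node" is fatal turns null into an exception. */
    static String chooseDataNode(List<String> nodes, Set<String> ignored)
            throws IOException {
        String node = bestNode(nodes, ignored);
        if (node == null) {
            throw new IOException("No live nodes contain the current block");
        }
        return node;
    }
}
```

This keeps the exception policy at the call site, so a caller that can retry or fall back is not forced to catch an exception the helper chose for it.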
[jira] [Commented] (HDFS-8214) Secondary NN Web UI shows wrong date for Last Checkpoint
[ https://issues.apache.org/jira/browse/HDFS-8214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518528#comment-14518528 ]
Andrew Wang commented on HDFS-8214:
-----------------------------------
+1 LGTM, thanks Charles. I rekicked Jenkins, should come back clean.

Secondary NN Web UI shows wrong date for Last Checkpoint
--------------------------------------------------------
Key: HDFS-8214
URL: https://issues.apache.org/jira/browse/HDFS-8214
Project: Hadoop HDFS
Issue Type: Bug
Components: HDFS, namenode
Affects Versions: 2.7.0
Reporter: Charles Lamb
Assignee: Charles Lamb
Attachments: HDFS-8214.001.patch, HDFS-8214.002.patch, HDFS-8214.003.patch

SecondaryNamenode is using Time.monotonicNow() to display Last Checkpoint in the web UI. This causes weird times, generally just after the epoch, to be displayed.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
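The class of bug described is easy to reproduce with plain JDK clocks (a hedged sketch; Hadoop's Time.monotonicNow() is, to my understanding, nanoTime-based like this stand-in): monotonic clocks measure elapsed time from an arbitrary origin, so formatting one as a calendar date typically yields a time near January 1970, while a "Last Checkpoint" display needs wall-clock time.

```java
import java.util.Date;

// Sketch: formatting a monotonic reading as a Date gives nonsense near the
// epoch; only wall-clock time is meaningful as a displayed timestamp.
public class CheckpointTimeSketch {
    static long monotonicNow() {
        return System.nanoTime() / 1_000_000; // elapsed ms, NOT an epoch timestamp
    }

    public static void main(String[] args) {
        System.out.println("monotonic as date:  " + new Date(monotonicNow()));
        System.out.println("wall clock as date: " + new Date(System.currentTimeMillis()));
    }
}
```

Monotonic time is the right tool for measuring intervals (it never jumps when the system clock is adjusted); wall-clock time is the right tool for displaying when something happened.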
[jira] [Commented] (HDFS-8273) FSNamesystem#Delete() should not call logSync() when holding the lock
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518596#comment-14518596 ] Hudson commented on HDFS-8273: -- FAILURE: Integrated in Hadoop-trunk-Commit #7697 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7697/]) HDFS-8273. FSNamesystem#Delete() should not call logSync() when holding the lock. Contributed by Haohui Mai. (wheat9: rev c79e7f7d997596e0c38ae4cddff2bd0910581c16) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirDeleteOp.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
[jira] [Commented] (HDFS-8280) Code Cleanup in DFSInputStream
[ https://issues.apache.org/jira/browse/HDFS-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518595#comment-14518595 ] Hudson commented on HDFS-8280: -- FAILURE: Integrated in Hadoop-trunk-Commit #7697 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/7697/]) HDFS-8280. Code Cleanup in DFSInputStream. Contributed by Jing Zhao. (wheat9: rev 439614b0c8a3df3d8b7967451c5331a0e034e13a) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8213) DFSClient should use hdfs.client.htrace HTrace configuration prefix rather than hadoop.htrace
[ https://issues.apache.org/jira/browse/HDFS-8213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518598#comment-14518598 ] Masatake Iwasaki commented on HDFS-8213: Thanks for the update, [~cmccabe]. I'm +1 (non-binding) for 002. bq. Let's do the hdfs-default.xml and other docs stuff later since it's not directly related to this Yeah. I filed HDFS-8284.
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516888#comment-14516888 ] Hudson commented on HDFS-8232: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #177 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/177/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java Missing datanode counters when using Metrics2 sink interface Key: HDFS-8232 URL: https://issues.apache.org/jira/browse/HDFS-8232 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: hdfs-8232.001.patch, hdfs-8232.002.patch When using the Metrics2 sink interface, none of the counters declared under Datanode:FSDataSetBean are visible. They are visible if you use JMX or if you query http://host:port/jmx. The expected behavior is that they be part of the sink interface and accessible in the putMetrics callback. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516895#comment-14516895 ]
Hudson commented on HDFS-8205:
------------------------------
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #177 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/177/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt

CommandFormat#parse() should not parse option as value of option
----------------------------------------------------------------
Key: HDFS-8205
URL: https://issues.apache.org/jira/browse/HDFS-8205
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.8.0
Reporter: Peter Shi
Assignee: Peter Shi
Priority: Blocker
Fix For: 2.8.0
Attachments: HDFS-8205.01.patch, HDFS-8205.02.patch, HDFS-8205.patch

{code}
./hadoop fs -count -q -t -h -v /
       QUOTA       REM_QUOTA     SPACE_QUOTA REM_SPACE_QUOTA    DIR_COUNT   FILE_COUNT   CONTENT_SIZE PATHNAME
15/04/21 15:20:19 INFO hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
9223372036854775807 9223372036854775763            none             inf           31           13           1230 /
{code}

This blocks query quota by storage type and clear quota by storage type.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
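The parsing rule the title names can be sketched with a simplified parser (the real class is org.apache.hadoop.fs.shell.CommandFormat; everything below is an invented illustration of the rule, not its code): a token that itself looks like an option, such as the "-h" following "-t" in the transcript above, must never be swallowed as the value of the preceding value-taking option.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch: parse "-opt [value]" style arguments, where a value-taking option
// only consumes the next token if that token is not itself an option.
public class OptionParseSketch {
    static Map<String, String> parse(String[] args, Set<String> takesValue) {
        Map<String, String> opts = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (!arg.startsWith("-")) {
                continue; // positional argument (e.g. a path)
            }
            String name = arg.substring(1);
            String value = null;
            // Only consume the next token as a value if it does not itself
            // look like an option -- the rule this issue is about.
            if (takesValue.contains(name) && i + 1 < args.length
                    && !args[i + 1].startsWith("-")) {
                value = args[++i];
            }
            opts.put(name, value);
        }
        return opts;
    }
}
```

Without the startsWith("-") check, "-t -h" would bind "-h" as the value of "-t", and the "-h" flag would silently disappear, which is the reported breakage.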
[jira] [Commented] (HDFS-7613) Block placement policy for erasure coding groups
[ https://issues.apache.org/jira/browse/HDFS-7613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517015#comment-14517015 ] Junping Du commented on HDFS-7613: -- Thanks [~zhz], the multi-policies implementation here sounds reasonable to me. Some quick questions: do we want DFS_BLOCK_PLACEMENT_EC_CLASSNAME_DEFAULT to be BlockPlacementPolicyEC rather than BlockPlacementPolicyDefault? I didn't check the details of BlockPlacementPolicyEC, so I'm not sure whether BlockPlacementPolicyDefault can cover all the cases BlockPlacementPolicyEC is meant for. Also, I see BlockPlacementPolicyEC supports the rack layer only; do we have a plan to support the NodeGroup layer as well? It would be great to make EC suitable for broader scenarios. Block placement policy for erasure coding groups Key: HDFS-7613 URL: https://issues.apache.org/jira/browse/HDFS-7613 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Zhe Zhang Assignee: Walter Su Attachments: HDFS-7613.001.patch Blocks in an erasure coding group should be placed in different failure domains -- different DataNodes at the minimum, and different racks ideally. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
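The placement constraint in the description ("different DataNodes at the minimum, different racks ideally") can be sketched as a toy target chooser. This is purely illustrative and has nothing to do with the real BlockPlacementPolicy API: pick one node per rack first, and reuse a rack (but never a node) only when the group is larger than the rack count.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch: choose up to n distinct datanodes for one EC block group,
// spreading across as many racks as possible before reusing any rack.
public class EcPlacementSketch {
    /** nodeToRack maps datanode name -> rack id. */
    static List<String> chooseTargets(Map<String, String> nodeToRack, int n) {
        List<String> chosen = new ArrayList<>();
        Set<String> usedRacks = new HashSet<>();
        // First pass: at most one node per rack.
        for (Map.Entry<String, String> e : nodeToRack.entrySet()) {
            if (chosen.size() == n) break;
            if (usedRacks.add(e.getValue())) {
                chosen.add(e.getKey());
            }
        }
        // Second pass: allow rack reuse (never node reuse) if the group is
        // larger than the number of racks.
        for (String node : nodeToRack.keySet()) {
            if (chosen.size() == n) break;
            if (!chosen.contains(node)) {
                chosen.add(node);
            }
        }
        return chosen;
    }
}
```

A NodeGroup-aware variant, as asked about in the comment, would add a middle tier between node and rack and prefer distinct node groups before reusing one.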
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516865#comment-14516865 ] Hudson commented on HDFS-8205: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2109 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2109/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516858#comment-14516858 ] Hudson commented on HDFS-8232: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #2109 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/2109/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516879#comment-14516879 ] Hudson commented on HDFS-8205: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #168 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/168/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516872#comment-14516872 ] Hudson commented on HDFS-8232: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #168 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/168/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516917#comment-14516917 ] Hudson commented on HDFS-8205: -- FAILURE: Integrated in Hadoop-Yarn-trunk #911 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/911/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14516910#comment-14516910 ] Hudson commented on HDFS-8232: -- FAILURE: Integrated in Hadoop-Yarn-trunk #911 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/911/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
[jira] [Commented] (HDFS-7678) Erasure coding: DFSInputStream with decode functionality
[ https://issues.apache.org/jira/browse/HDFS-7678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518401#comment-14518401 ]
Andrew Wang commented on HDFS-7678:
-----------------------------------
Thanks for the patch Zhe, some nice functionality here. Some review comments:
Nits:
* Extra imports in DFSStripedInputStream
* Some lines longer than 80 chars
Rest:
* I see us swallowing InterruptedException, which is quite naughty, but a lot of other input stream code does the same. It's a code smell; we really should be cleaning up and rethrowing the exception. Think about it at least for this patch, and we should file a follow-on for trunk and potentially the rest of the EC code.
* waitNextCompletion: shouldn't the read timeout be an overall timeout, not a per-task timeout? Users at least want an overall timeout.
* Throwing InterruptedException on empty futures is semantically incorrect; why not return null?
* waitNextCompletion and its usage seem kind of complicated. Let's think about simplifying it.
* Do we actually need missingBlkIndices or the non-success cases? It's the set complement of fetchedBlkIndices. Can determine it after.
* If we enforce the overall timeout in fetchBlockByteRange, we can do the futures cleanup there too. Pass the delta timeout down to waitNextCompletion. This feels better, since it links the timeout case with the timeout cleanup. Maybe another wrapper function to encapsulate this, since waitNextCompletion is used in two places.
* Comments all over this logic would be good.
* Is it possible to have a 0 rp.getReadLength()? Precondition check this?
* In general I would prefer to see Precondition checks rather than asserts, since asserts are disabled outside of tests.
* We always go through a function called fetchExtraBlks... even if we successfully got all the blocks we need the first time. No early exit?
* Also it seems we have some code duplication between fetch and fetchExtra; let's think about breaking out some shared functions. I wonder if it'd be better to do all the fetching first (including parity if necessary), then pass it over to a decode function (if necessary).
* found is not used

Erasure coding: DFSInputStream with decode functionality
--------------------------------------------------------
Key: HDFS-7678
URL: https://issues.apache.org/jira/browse/HDFS-7678
Project: Hadoop HDFS
Issue Type: Sub-task
Affects Versions: HDFS-7285
Reporter: Li Bo
Assignee: Zhe Zhang
Attachments: BlockGroupReader.patch, HDFS-7678-HDFS-7285.002.patch, HDFS-7678.000.patch, HDFS-7678.001.patch

A block group reader will read data from a BlockGroup whether it is in striping layout or contiguous layout. The corrupt blocks may be known before reading (told by the namenode) or found during reading. The block group reader needs to do decoding work when some blocks are found corrupt.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
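The "overall timeout, pass the delta down" suggestion in the review can be sketched as follows. This is an invented illustration of the shape, not the patch's code: the caller fixes one deadline for the whole striped read, and each wait on the completion service gets only the time remaining until that deadline, rather than a fresh full timeout per task. Note also that InterruptedException is declared and propagated, not swallowed.

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: one deadline governs all per-chunk waits of a striped read.
public class DeadlineWaitSketch {
    static <T> T nextWithDeadline(CompletionService<T> cs, long deadlineNanos)
            throws InterruptedException, ExecutionException, TimeoutException {
        long remaining = deadlineNanos - System.nanoTime();
        if (remaining <= 0) {
            throw new TimeoutException("striped read deadline exceeded");
        }
        // Wait only for the remaining delta, never a fresh full timeout.
        Future<T> f = cs.poll(remaining, TimeUnit.NANOSECONDS);
        if (f == null) {
            throw new TimeoutException("striped read deadline exceeded");
        }
        return f.get(); // propagates the task's own failure, if any
    }
}
```

Enforcing the deadline at one place also gives a natural spot to cancel the still-outstanding futures on timeout, which is the cleanup pairing the review asks for.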
[jira] [Commented] (HDFS-7995) Implement chmod in the HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-7995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518419#comment-14518419 ]
Hadoop QA commented on HDFS-7995:
---------------------------------
| (/) *{color:green}+1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 0m 0s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | release audit | 0m 14s | The applied patch does not increase the total number of release audit warnings. |
| | | 0m 20s | |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728954/HDFS-7995.02.patch |
| Optional Tests | |
| git revision | trunk / 5190923 |
| Java | 1.7.0_55 |
| uname | Linux asf906.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10442/console |
This message was automatically generated.

Implement chmod in the HDFS Web UI
----------------------------------
Key: HDFS-7995
URL: https://issues.apache.org/jira/browse/HDFS-7995
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: Ravi Prakash
Assignee: Ravi Prakash
Attachments: HDFS-7995.01.patch, HDFS-7995.02.patch

We should let users change the permissions of files and directories using the HDFS Web UI.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HDFS-8269) getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
[ https://issues.apache.org/jira/browse/HDFS-8269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14518424#comment-14518424 ]
Hadoop QA commented on HDFS-8269:
---------------------------------
| (x) *{color:red}-1 overall{color}* |
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 14m 55s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | javac | 7m 40s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 42s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 4m 2s | The applied patch generated 1 additional checkstyle issues. |
| {color:green}+1{color} | install | 1m 34s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 3m 12s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | native | 3m 19s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 163m 53s | Tests failed in hadoop-hdfs. |
| | | | 209m 18s | |
|| Reason || Tests ||
| Failed unit tests | hadoop.hdfs.TestFileCreationClient |
| | hadoop.hdfs.server.namenode.TestFsck |
| Timed out tests | org.apache.hadoop.hdfs.server.mover.TestMover |
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12728915/HDFS-8269.002.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / bc1bd7e |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10434/artifact/patchprocess/checkstyle-result-diff.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10434/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10434/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10434/console |
This message was automatically generated.

getBlockLocations() does not resolve the .reserved path and generates incorrect edit logs when updating the atime
-----------------------------------------------------------------------------------------------------------------
Key: HDFS-8269
URL: https://issues.apache.org/jira/browse/HDFS-8269
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Yesha Vora
Assignee: Haohui Mai
Priority: Blocker
Attachments: HDFS-8269.000.patch, HDFS-8269.001.patch, HDFS-8269.002.patch

When {{FSNamesystem#getBlockLocations}} updates the access time of the INode, it uses the path passed from the client, which generates incorrect edit log entries:
{noformat}
<RECORD>
  <OPCODE>OP_TIMES</OPCODE>
  <DATA>
    <TXID>5085</TXID>
    <LENGTH>0</LENGTH>
    <PATH>/.reserved/.inodes/18230</PATH>
    <MTIME>-1</MTIME>
    <ATIME>1429908236392</ATIME>
  </DATA>
</RECORD>
{noformat}
Note that the NN does not resolve the {{/.reserved}} path when processing the edit log, so it eventually leads to an NPE when loading the edit logs.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
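The idea of the fix can be sketched as follows. Everything here is an invented illustration (the method and the inode-to-path map are not the NameNode's real structures): resolve a "/.reserved/.inodes/&lt;id&gt;" client path to the inode's actual path before the atime update is written to the edit log, so replay never encounters the reserved prefix.

```java
import java.util.Map;

// Sketch: translate a reserved inode path to its real path before logging.
public class ReservedPathSketch {
    static final String PREFIX = "/.reserved/.inodes/";

    static String resolve(String src, Map<Long, String> inodeToPath) {
        if (!src.startsWith(PREFIX)) {
            return src; // already a normal path
        }
        long inodeId = Long.parseLong(src.substring(PREFIX.length()));
        String real = inodeToPath.get(inodeId);
        if (real == null) {
            throw new IllegalArgumentException("unknown inode " + inodeId);
        }
        return real; // log this resolved path, never the reserved one
    }
}
```

With this translation applied before logging, the OP_TIMES record above would carry the inode's real path, and replay would not need any special handling for {{/.reserved}}.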
[jira] [Updated] (HDFS-7847) Modify NNThroughputBenchmark to be able to operate on a remote NameNode
[ https://issues.apache.org/jira/browse/HDFS-7847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Lamb updated HDFS-7847: --- Attachment: HDFS-7847.004.patch .004 is rebased onto trunk. Modify NNThroughputBenchmark to be able to operate on a remote NameNode --- Key: HDFS-7847 URL: https://issues.apache.org/jira/browse/HDFS-7847 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7836 Reporter: Colin Patrick McCabe Assignee: Charles Lamb Fix For: HDFS-7836 Attachments: HDFS-7847.000.patch, HDFS-7847.001.patch, HDFS-7847.002.patch, HDFS-7847.003.patch, HDFS-7847.004.patch, make_blocks.tar.gz Modify NNThroughputBenchmark to be able to operate on a NN that is not in process. A followon Jira will modify it some more to allow quantifying native and java heap sizes, and some latency numbers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8273) FSNamesystem#Delete() should not call logSync() when holding the lock
[ https://issues.apache.org/jira/browse/HDFS-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Haohui Mai updated HDFS-8273: - Summary: FSNamesystem#Delete() should not call logSync() when holding the lock (was: logSync() is called inside of write lock for delete op) FSNamesystem#Delete() should not call logSync() when holding the lock - Key: HDFS-8273 URL: https://issues.apache.org/jira/browse/HDFS-8273 Project: Hadoop HDFS Issue Type: Bug Components: namenode Affects Versions: 2.7.0 Reporter: Jing Zhao Assignee: Haohui Mai Priority: Blocker Attachments: HDFS-8273.000.patch, HDFS-8273.001.patch HDFS-7573 moves the logSync call inside of the write lock by accident. We should move it out. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
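For readers unfamiliar with the pattern being restored here: the namespace mutation and the in-memory edit append happen under the FSNamesystem write lock, while the expensive logSync() (which forces edits to stable storage) runs after the lock is released. The sketch below illustrates that shape with simplified stand-in classes; DeleteOp, its fields, and the string edits are illustrative, not the actual FSNamesystem API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified sketch of the locking pattern HDFS-8273 restores: mutate the
// namespace and append the edit under the write lock, but flush the edit
// log (logSync) only after the lock is released. All names here are
// illustrative stand-ins, not the real FSNamesystem API.
public class DeleteOp {
    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    private final List<String> namespace = new ArrayList<>();
    private final List<String> pendingEdits = new ArrayList<>();
    public final List<String> syncedEdits = new ArrayList<>();

    public void create(String path) {
        fsLock.writeLock().lock();
        try {
            namespace.add(path);
        } finally {
            fsLock.writeLock().unlock();
        }
        logSync();
    }

    public boolean delete(String path) {
        boolean removed;
        fsLock.writeLock().lock();
        try {
            removed = namespace.remove(path);
            if (removed) {
                pendingEdits.add("OP_DELETE " + path); // logEdit: in-memory only
            }
        } finally {
            fsLock.writeLock().unlock();
        }
        // logSync() runs here, outside the lock, so other handlers are not
        // blocked behind a disk sync while the edit is forced to storage.
        logSync();
        return removed;
    }

    private void logSync() {
        syncedEdits.addAll(pendingEdits);
        pendingEdits.clear();
    }
}
```

Holding the write lock across logSync() serializes every other namespace operation behind a disk sync, which is why moving the call out of the locked section matters for throughput.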
[jira] [Commented] (HDFS-8201) Add an end to end test for stripping file writing and reading
[ https://issues.apache.org/jira/browse/HDFS-8201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518545#comment-14518545 ] Xinwei Qin commented on HDFS-8201: --- [~zhz] I have noticed that JIRA. I think your suggestion is very good. Add an end to end test for stripping file writing and reading - Key: HDFS-8201 URL: https://issues.apache.org/jira/browse/HDFS-8201 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Kai Zheng Assignee: Xinwei Qin Attachments: HDFS-8201.001.patch According to off-line discussion with [~zhz] and [~xinwei], we need to implement an end to end test for stripping file support: * Create an EC zone; * Create a file in the zone; * Write various typical sizes of content to the file, each size maybe a test method; * Read the written content back; * Compare the written content and read content to ensure it's good; The test facility is subject to add more steps for erasure encoding and recovering. Will open separate issue for it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HDFS-8277) Safemode enter fails when Standby NameNode is down
Hari Sekhon created HDFS-8277: - Summary: Safemode enter fails when Standby NameNode is down Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517164#comment-14517164 ] Hudson commented on HDFS-8232: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #178 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/178/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java Missing datanode counters when using Metrics2 sink interface Key: HDFS-8232 URL: https://issues.apache.org/jira/browse/HDFS-8232 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: hdfs-8232.001.patch, hdfs-8232.002.patch When using the Metric2 Sink interface none of the counters declared under Dataanode:FSDataSetBean are visible. They are visible if you use JMX or if you do http://host:port/jmx. Expected behavior is that they be part of Sink interface and accessible in the putMetrics call back. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8271) NameNode should bind on both IPv6 and IPv4 if running on dual-stack machine and IPv6 enabled
[ https://issues.apache.org/jira/browse/HDFS-8271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517276#comment-14517276 ] Steve Loughran commented on HDFS-8271: -- Nate -why not create an Über-JIRA to cover the overall problem of Hadoop to support IPv6 .. all these things could be grouped underneath. HADOOP-11574 is mainly about recognise network problems and provide diagnostics, rather than actual IPv6 support NameNode should bind on both IPv6 and IPv4 if running on dual-stack machine and IPv6 enabled Key: HDFS-8271 URL: https://issues.apache.org/jira/browse/HDFS-8271 Project: Hadoop HDFS Issue Type: Improvement Components: namenode Affects Versions: 2.7.1 Reporter: Nate Edel Assignee: Nate Edel Labels: ipv6 NameNode works properly on IPv4 or IPv6 single stack (assuming in the latter case that scripts have been changed to disable preferIPv4Stack, and dependent on the client/data node fix in HDFS-8078). On dual-stack machines, NameNode listens only on IPv4 (even ignoring preferIPv6Addresses being set.) Our initial use case for IPv6 is IPv6-only clusters, but ideally we'd support binding to both the IPv4 and IPv6 machine addresses so that we can support heterogenous clusters (some dual-stack and some IPv6-only machines.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517239#comment-14517239 ] Hudson commented on HDFS-8205: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2127 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2127/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt CommandFormat#parse() should not parse option as value of option Key: HDFS-8205 URL: https://issues.apache.org/jira/browse/HDFS-8205 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Peter Shi Assignee: Peter Shi Priority: Blocker Fix For: 2.8.0 Attachments: HDFS-8205.01.patch, HDFS-8205.02.patch, HDFS-8205.patch
{code}
./hadoop fs -count -q -t -h -v /
      QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
15/04/21 15:20:19 INFO hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
9223372036854775807  9223372036854775763  none  inf  31  13  1230  /
{code}
This blocks querying quota by storage type and clearing quota by storage type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
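The bug shown in the quoted command is that {{-h}}, itself an option, was consumed as the value of {{-t}}, silently disabling the remaining flags. Below is a minimal sketch of the corrected parsing rule (a token that is itself a recognized option is never swallowed as another option's value); OptionParser and all of its methods are illustrative stand-ins, not the real CommandFormat API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch (not the real CommandFormat) of the rule HDFS-8205
// enforces: when scanning "-count -q -t -h -v", a token that is itself a
// recognized option must not be swallowed as the value of the option
// before it.
public class OptionParser {
    private final Set<String> flags = new HashSet<>();
    private final Set<String> valuedOpts = new HashSet<>();

    public OptionParser flag(String f) { flags.add(f); return this; }
    public OptionParser valued(String o) { valuedOpts.add(o); return this; }

    public Map<String, String> parse(String... args) {
        Map<String, String> parsed = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            String a = args[i];
            if (flags.contains(a)) {
                parsed.put(a, "");
            } else if (valuedOpts.contains(a)) {
                String next = (i + 1 < args.length) ? args[i + 1] : null;
                // The fix: a following token that is itself a known option
                // is NOT consumed as this option's value.
                if (next != null && !flags.contains(next)
                        && !valuedOpts.contains(next)) {
                    parsed.put(a, next);
                    i++;
                } else {
                    parsed.put(a, ""); // option present, value omitted
                }
            }
            // anything else is treated as a positional argument here
        }
        return parsed;
    }
}
```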
[jira] [Commented] (HDFS-8232) Missing datanode counters when using Metrics2 sink interface
[ https://issues.apache.org/jira/browse/HDFS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517232#comment-14517232 ] Hudson commented on HDFS-8232: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk #2127 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/2127/]) HDFS-8232. Missing datanode counters when using Metrics2 sink interface. Contributed by Anu Engineer. (cnauroth: rev feb68cb5470dc3e6c16b6bc1549141613e360601) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/FSDatasetMBean.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeFSDataSetSink.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/metrics/DataNodeMetricHelper.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Missing datanode counters when using Metrics2 sink interface Key: HDFS-8232 URL: https://issues.apache.org/jira/browse/HDFS-8232 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.7.1 Reporter: Anu Engineer Assignee: Anu Engineer Fix For: 2.8.0 Attachments: hdfs-8232.001.patch, hdfs-8232.002.patch When using the Metric2 Sink interface none of the counters declared under Dataanode:FSDataSetBean are visible. They are visible if you use JMX or if you do http://host:port/jmx. Expected behavior is that they be part of Sink interface and accessible in the putMetrics call back. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8205) CommandFormat#parse() should not parse option as value of option
[ https://issues.apache.org/jira/browse/HDFS-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517171#comment-14517171 ] Hudson commented on HDFS-8205: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #178 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/178/]) HDFS-8205. CommandFormat#parse() should not parse option as value of option. (Contributed by Peter Shi and Xiaoyu Yao) (arp: rev 0d5b0143cc003e132ce454415e35d55d46311416) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/DFSAdmin.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCount.java * hadoop-hdfs-project/hadoop-hdfs/src/test/resources/testHDFSConf.xml * hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandFormat.java HDFS-8205. Fix CHANGES.txt (arp: rev 6bae5962cd70ac33fe599c50fb2a906830e5d4b2) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt CommandFormat#parse() should not parse option as value of option Key: HDFS-8205 URL: https://issues.apache.org/jira/browse/HDFS-8205 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.8.0 Reporter: Peter Shi Assignee: Peter Shi Priority: Blocker Fix For: 2.8.0 Attachments: HDFS-8205.01.patch, HDFS-8205.02.patch, HDFS-8205.patch
{code}
./hadoop fs -count -q -t -h -v /
      QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
15/04/21 15:20:19 INFO hdfs.DFSClient: Sets dfs.client.block.write.replace-datanode-on-failure.replication to 0
9223372036854775807  9223372036854775763  none  inf  31  13  1230  /
{code}
This blocks querying quota by storage type and clearing quota by storage type. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Work started] (HDFS-8246) Get HDFS file name based on block pool id and block id
[ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-8246 started by feng xu. - Get HDFS file name based on block pool id and block id -- Key: HDFS-8246 URL: https://issues.apache.org/jira/browse/HDFS-8246 Project: Hadoop HDFS Issue Type: New Feature Components: HDFS, hdfs-client, namenode Reporter: feng xu Assignee: feng xu Attachments: HDFS-8246.0.patch This feature provides an HDFS shell command and C/Java APIs to retrieve the HDFS file name for a given block pool id and block id. 1. The Java API in class DistributedFileSystem public String getFileName(String poolId, long blockId) throws IOException 2. The C API in hdfs.c char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId) 3. The HDFS shell command hdfs dfs [generic options] -fn poolId blockId This feature is useful if you have an HDFS block file name in the local file system and want to find the related HDFS file name in the HDFS name space (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop). Each HDFS block file name in the local file system contains both the block pool id and the block id; for example, in the HDFS block file name /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825, the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825. The block pool id is uniquely related to an HDFS name node/name space, and the block id is uniquely related to an HDFS file within an HDFS name node/name space, so the combination of a block pool id and a block id uniquely identifies an HDFS file name. The shell command and C/Java APIs do not map the block pool id to a name node, so it's the user's responsibility to talk to the correct name node in a federation environment that has multiple name nodes. The block pool id is used by the name node to check whether the user is talking to the correct name node. The implementation is straightforward. 
The client request to get the HDFS file name reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem in the name node through RPC, and the new method does the following: (1) Validate the block pool id. (2) Create a Block based on the block id. (3) Get the BlockInfoContiguous from the Block. (4) Get the BlockCollection from the BlockInfoContiguous. (5) Get the file name from the BlockCollection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
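The feature leans on the fact that a local block file path carries both identifiers. As a small illustration (a standalone helper written for this note, not part of the proposed API), the two values can be pulled out of a path like the example above:

```java
// Standalone helper extracting the block pool id and block id from a local
// block file path of the form .../BP-<pool>/current/finalized/.../blk_<id>.
// Illustrative only; the proposed getFileName API consumes these values,
// it does not include this parser.
public class BlockPathParser {
    public static String blockPoolId(String path) {
        for (String part : path.split("/")) {
            if (part.startsWith("BP-")) {
                return part;
            }
        }
        throw new IllegalArgumentException("no block pool id in " + path);
    }

    public static long blockId(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (!name.startsWith("blk_")) {
            throw new IllegalArgumentException("not a block file: " + path);
        }
        // meta files look like blk_<id>_<genstamp>.meta; keep only the id
        String id = name.substring("blk_".length()).split("_")[0];
        return Long.parseLong(id);
    }
}
```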
[jira] [Commented] (HDFS-8248) Store INodeId instead of the INodeFile object in BlockInfoContiguous
[ https://issues.apache.org/jira/browse/HDFS-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518615#comment-14518615 ] Hadoop QA commented on HDFS-8248: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 31s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 7 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 27s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 34s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 7m 54s | The applied patch generated 4 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 31s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 7s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 21s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 165m 2s | Tests passed in hadoop-hdfs. 
| | | | 213m 29s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728950/HDFS-8248.003.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5190923 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10441/artifact/patchprocess/checkstyle-result-diff.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10441/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10441/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf902.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10441/console | This message was automatically generated. Store INodeId instead of the INodeFile object in BlockInfoContiguous Key: HDFS-8248 URL: https://issues.apache.org/jira/browse/HDFS-8248 Project: Hadoop HDFS Issue Type: Improvement Reporter: Haohui Mai Assignee: Haohui Mai Attachments: HDFS-8248.000.patch, HDFS-8248.001.patch, HDFS-8248.002.patch, HDFS-8248.003.patch Currently the namespace and the block manager are tightly coupled together. There are two couplings in terms of implementation: 1. The {{BlockInfoContiguous}} stores a reference of the {{INodeFile}} that owns the block, so that the block manager can look up the corresponding file when replicating blocks, recovering from pipeline failures, etc. 1. The {{INodeFile}} stores {{BlockInfoContiguous}} objects that the file owns. Decoupling the namespace and the block manager allows the BM to be separated out from the Java heap or even as a standalone process. This jira proposes to remove the first coupling by storing the id of the inode instead of the object reference of {{INodeFile}} in the {{BlockInfoContiguous}} class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
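The decoupling the patch proposes can be pictured with a toy model: the block-side record keeps only the owning inode's id (a long), and the block manager resolves file information through a narrow lookup interface rather than holding an object reference into the namespace. All names below are simplified stand-ins for the real {{BlockInfoContiguous}} and {{INodeFile}} classes.

```java
// Toy model of storing an inode id instead of an object reference.
// BlockRecord, Namespace, and BlockOwnerLookup are illustrative names,
// not the actual HDFS classes.
public class BlockOwnerLookup {
    public static class BlockRecord {
        public final long blockId;
        public final long inodeId; // was: a direct INodeFile reference

        public BlockRecord(long blockId, long inodeId) {
            this.blockId = blockId;
            this.inodeId = inodeId;
        }
    }

    // The only dependency the block manager keeps on the namespace:
    // a lookup from inode id to file, which could live in another
    // heap or process.
    public interface Namespace {
        String fileName(long inodeId);
    }

    private final Namespace ns;

    public BlockOwnerLookup(Namespace ns) {
        this.ns = ns;
    }

    public String ownerOf(BlockRecord b) {
        return ns.fileName(b.inodeId);
    }
}
```

The indirection is the point: because the block manager only knows an id and an interface, the namespace side is free to move off the Java heap without touching block-manager code.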
[jira] [Updated] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7281: -- Release Note: The patch improves the reporting around missing blocks and corrupted blocks. 1. A block is missing if and only if all DNs of its expected replicas are dead. 2. A block is corrupted if and only if all its available replicas are corrupted. So if a block has 3 replicas, one of the DNs is dead, and the other two replicas are corrupted, it will be marked as corrupted. 3. A new line is added to fsck output to display the corrupt block size per file. 4. A new line is added to fsck output to display the number of missing blocks in the summary section. Missing block is marked as corrupted block -- Key: HDFS-7281 URL: https://issues.apache.org/jira/browse/HDFS-7281 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Labels: supportability Attachments: HDFS-7281-2.patch, HDFS-7281-3.patch, HDFS-7281-4.patch, HDFS-7281.patch In the situation where a block has lost all its replicas, fsck shows the block as missing as well as corrupted. Perhaps it is better not to mark the block corrupted in this case. The reason it is marked as corrupted is numCorruptNodes == numNodes == 0 in the following code.
{noformat}
// BlockManager
final boolean isCorrupt = numCorruptNodes == numNodes;
{noformat}
Would like to clarify whether it is the intent to mark a missing block as corrupted or whether it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
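The release-note rules reduce to two predicates over the replicas that are still reachable. A compact sketch, with plain counters standing in for the real BlockManager state ("available" meaning replicas on live DNs):

```java
// Sketch of the classification rules from the release note, not the
// actual BlockManager code: a block is missing only when no replica is
// available at all, and corrupt only when every available replica is
// corrupt. Counter arguments are illustrative stand-ins.
public class BlockHealth {
    // All DNs of the expected replicas are dead: nothing reachable.
    public static boolean isMissing(int availableReplicas) {
        return availableReplicas == 0;
    }

    // At least one replica is reachable, and every reachable one is corrupt.
    public static boolean isCorrupt(int availableReplicas, int corruptReplicas) {
        return availableReplicas > 0 && corruptReplicas == availableReplicas;
    }
}
```

With the example from the release note (3 replicas, one DN dead, the two available replicas corrupt), isCorrupt(2, 2) holds while isMissing(2) does not; a block with no available replicas is missing and, unlike the behavior questioned in the issue, no longer reported as corrupt.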
[jira] [Commented] (HDFS-8283) DataStreamer cleanup and some minor improvement
[ https://issues.apache.org/jira/browse/HDFS-8283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518736#comment-14518736 ] Hadoop QA commented on HDFS-8283: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 58s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 44s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 46s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 21s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 4m 3s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 33s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 4m 47s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | common tests | 23m 40s | Tests passed in hadoop-common. | | {color:red}-1{color} | hdfs tests | 164m 29s | Tests failed in hadoop-hdfs. 
| | | | 231m 58s | | \\ \\ || Reason || Tests || | Failed unit tests | hadoop.hdfs.TestAppendSnapshotTruncate | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728994/h8283_20150428.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / c79e7f7 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10445/artifact/patchprocess/checkstyle-result-diff.txt | | hadoop-common test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10445/artifact/patchprocess/testrun_hadoop-common.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10445/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10445/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf903.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10445/console | This message was automatically generated. DataStreamer cleanup and some minor improvement --- Key: HDFS-8283 URL: https://issues.apache.org/jira/browse/HDFS-8283 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Reporter: Tsz Wo Nicholas Sze Assignee: Tsz Wo Nicholas Sze Priority: Minor Attachments: h8283_20150428.patch - When throwing an exception -* always set lastException -* always creating a new exception so that it has the new stack trace - Add LOG. - Add final to isAppend and favoredNodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7348) Erasure Coding: striped block recovery
[ https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518630#comment-14518630 ] Yi Liu commented on HDFS-7348: -- Thanks [~zhz] for the review! The comments are helpful. {quote} At the DN level I don't think we need to care about cellSize? Since we always recover entire blocks, the client-side logic taking care of cells can be simplified here. {quote} Yes, the DN doesn't need to care about cellSize; here we just use it as a read buffer size, and it is a multiple of {{bytesPerChecksum}}, so it's convenient for CRC calculation. {quote} Since recovering multiple missing blocks at once is a pretty rare case, should we just reconstruct all missing blocks and use DataNode#DataTransfer to push them out? Should we save a copy of the reconstructed block locally? More space will be used; but it will avoid re-decoding if the push fails. {quote} Good question and discussion. The best way to avoid re-decoding when a push fails is to check the packet ack before we discard the decoded result and start the next round of decoding. Saving a copy locally would increase the DataNode's burden (i.e., it affects performance, disk space/management, calculates the CRC multiple times, and so on) and increase management overhead. If we don't check the packet ack, we can't know whether the recovered block was transferred correctly; if we choose to check the packet ack, we don't need to save it locally. As I described in the design above and in comments inline in the code, currently we don't check the packet ack, which is similar to continuous block replication, but EC recovery is more expensive, so we could consider checking the packet ack as a further improvement. I can do it (check the packet ack) in a separate JIRA, maybe in phase 2; of course we can discuss more here. {quote} I filed HDFS-8282 to move StripedReadResult and waitNextCompletion to StripedBlockUtil. {quote} That's good, I will review that JIRA after it's ready. 
{quote} In foreground recovery we read in parallel to minimize latency. It's an interesting design question whether we should do the same in background recovery. More discussions are needed here. {quote} We can discuss this point more here. I think it's OK and I don't see a downside: if we don't recover it as soon as possible on the DN, clients also need to do on-line read recovery, which may cause more network IO (multiple clients). {quote} Another option is to read entire blocks and then decode {quote} That's a big issue for memory, especially since there may be multiple striped block recoveries at the same time; I think we should not do this. On the other hand, the fast way to decode is to use native code and utilize CPU instructions as planned in the design. I have experience writing native decryption code for the HDFS encryption-at-rest feature, where we also have a buffer (default 64KB) to invoke JNI. {quote} Maybe we can move getBlock to StripedBlockUtil too; it's a useful util to only parse the Block. If it sounds good to you I'll move it in HDFS-8282. {quote} It's good for me if you move it in HDFS-8282; I think we will also need to use it in the future :) I will fix the {{ArrayList}} initialization in the next patch. Erasure Coding: striped block recovery -- Key: HDFS-7348 URL: https://issues.apache.org/jira/browse/HDFS-7348 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Kai Zheng Assignee: Yi Liu Attachments: ECWorker.java, HDFS-7348.001.patch This JIRA is to recover one or more missing striped blocks in a striped block group. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
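The buffered-recovery approach discussed above (read and decode a fixed-size buffer at a time instead of whole blocks) can be sketched as follows. XOR stands in for the real Reed-Solomon decoder purely to keep the example self-contained; the buffer size and all names are illustrative, not the actual ECWorker code.

```java
// Sketch of buffered recovery: work through the surviving sources one
// fixed-size buffer at a time so memory stays bounded regardless of block
// size. XOR is a stand-in for the real erasure decoder; with XOR parity,
// folding all surviving sources together reproduces the lost one.
public class ChunkedRecovery {
    public static byte[] recover(byte[][] sources, int bufSize) {
        int len = sources[0].length; // assumes at least one equal-length source
        byte[] recovered = new byte[len];
        for (int off = 0; off < len; off += bufSize) {
            int n = Math.min(bufSize, len - off);
            // decode one buffer's worth from all surviving sources
            for (byte[] src : sources) {
                for (int i = 0; i < n; i++) {
                    recovered[off + i] ^= src[off + i];
                }
            }
        }
        return recovered;
    }
}
```

Only one buffer per source is live at any moment, which is the memory argument made above against reading entire blocks before decoding.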
[jira] [Commented] (HDFS-7980) Incremental BlockReport will dramatically slow down the startup of a namenode
[ https://issues.apache.org/jira/browse/HDFS-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518628#comment-14518628 ] Walter Su commented on HDFS-7980: - The 004 patch fixes the failed test. By the way: 1. If a block exists in the Full BR but not in the IBR, it should be processed by {{processFullFullBlockReport(..)}}. (This is how it has been treated since long ago.) 2. If a block exists in the Full BR and also in the IBR, it's OK for it to be processed by {{processFullFullBlockReport(..)}}. (Because it's well handled by the IBR processing logic.) 3. If a block is not in the Full BR but is in the IBR, it was created and deleted very quickly; such a block doesn't relate to this issue. I think the test included in the patch is OK. If you have any ideas for adding more tests, please let me know. Thanks. Incremental BlockReport will dramatically slow down the startup of a namenode -- Key: HDFS-7980 URL: https://issues.apache.org/jira/browse/HDFS-7980 Project: Hadoop HDFS Issue Type: Bug Reporter: Hui Zheng Assignee: Walter Su Attachments: HDFS-7980.001.patch, HDFS-7980.002.patch, HDFS-7980.003.patch, HDFS-7980.004.patch In the current implementation the datanode calls the reportReceivedDeletedBlocks() method (an incremental block report) before calling the bpNamenode.blockReport() method. So in a large (several thousand datanodes) and busy cluster it will slow down the startup of the namenode by more than one hour.
{code}
List<DatanodeCommand> blockReport() throws IOException {
  // send block report if timer has expired.
  final long startTime = now();
  if (startTime - lastBlockReport <= dnConf.blockReportInterval) {
    return null;
  }

  final ArrayList<DatanodeCommand> cmds = new ArrayList<DatanodeCommand>();

  // Flush any block information that precedes the block report. Otherwise
  // we have a chance that we will miss the delHint information
  // or we will report an RBW replica after the BlockReport already reports
  // a FINALIZED one.
  reportReceivedDeletedBlocks();
  lastDeletedReport = startTime;
  ...

  // Send the reports to the NN.
  int numReportsSent = 0;
  int numRPCs = 0;
  boolean success = false;
  long brSendStartTime = now();
  try {
    if (totalBlockCount < dnConf.blockReportSplitThreshold) {
      // Below split threshold, send all reports in a single message.
      DatanodeCommand cmd = bpNamenode.blockReport(
          bpRegistration, bpos.getBlockPoolId(), reports);
{code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
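The split-threshold decision at the end of the quoted snippet can be isolated into a small sketch: below dfs.blockreport.split.threshold all per-storage reports go out in one RPC, otherwise one RPC per report. The types below are simplified stand-ins for the real StorageBlockReport machinery.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the report-splitting decision shown in the snippet above.
// Strings stand in for per-storage block reports; ReportBatcher and its
// method are illustrative names, not the actual BPServiceActor API.
public class ReportBatcher {
    public static List<List<String>> batch(List<String> reports,
                                           long totalBlockCount,
                                           long splitThreshold) {
        List<List<String>> rpcs = new ArrayList<>();
        if (totalBlockCount < splitThreshold) {
            rpcs.add(reports); // below threshold: one message for everything
        } else {
            for (String r : reports) {
                List<String> one = new ArrayList<>();
                one.add(r);
                rpcs.add(one); // above threshold: one RPC per storage report
            }
        }
        return rpcs;
    }
}
```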
[jira] [Updated] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brahma Reddy Battula updated HDFS-8277: --- Issue Type: Improvement (was: Bug) Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Improvement Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
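The behavior the reporter asks for, trying each configured NameNode in turn instead of stopping at the first connection refusal, can be sketched generically. The Function stands in for the actual dfsadmin RPC call; the class and method names are hypothetical, not the real failover-proxy implementation.

```java
import java.net.ConnectException;
import java.util.List;
import java.util.function.Function;

// Sketch of trying each configured NameNode on connection refusal, the
// behavior the report says the standard HDFS client has but this dfsadmin
// path lacks. Names are illustrative; the RPC is modeled as a Function
// that wraps ConnectException in a RuntimeException.
public class TryEachNameNode {
    public static <T> T invoke(List<String> namenodes,
                               Function<String, T> call) throws ConnectException {
        ConnectException last = null;
        for (String nn : namenodes) {
            try {
                return call.apply(nn);
            } catch (RuntimeException e) {
                if (e.getCause() instanceof ConnectException) {
                    last = (ConnectException) e.getCause(); // try the next NN
                } else {
                    throw e; // non-connection failures propagate immediately
                }
            }
        }
        throw last != null ? last : new ConnectException("no namenodes given");
    }
}
```

In the scenario from the report, nn1 refuses the connection and the call would fall through to nn2, the surviving Active NameNode, instead of failing outright.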
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518648#comment-14518648 ] Yi Liu commented on HDFS-3107: -- Hi Neeta, please checkout the latest trunk or if you get Hadoop 2.7 version, you will find the truncate API exposed via FileSystem there. HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Lei Chang Assignee: Plamen Jeliazkov Fix For: 2.7.0 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation) which is a reverse operation of append, which makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8272) Erasure Coding: simplify the retry logic in DFSStripedInputStream
[ https://issues.apache.org/jira/browse/HDFS-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518671#comment-14518671 ] Zhe Zhang commented on HDFS-8272: - Thanks Jing for updating the patch! {{blockSeekTo}} looks good to me now. I also agree we should get rid of the retry in {{readWithStrategy}}. The retry logic in {{DFSInputStream#readBuffer}} will try connecting to the same node: {code} /* possibly retry the same node so that transient errors don't * result in application level failures (e.g. Datanode could have * closed the connection because the client is idle for too long). */ sourceFound = seekToBlockSource(pos); {code} I guess we should keep this part? Erasure Coding: simplify the retry logic in DFSStripedInputStream - Key: HDFS-8272 URL: https://issues.apache.org/jira/browse/HDFS-8272 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Jing Zhao Assignee: Jing Zhao Attachments: h8272-HDFS-7285.000.patch, h8272-HDFS-7285.001.patch Currently the retry logic in DFSStripedInputStream is still the same as in DFSInputStream. More specifically, every failed read tries to search for another source node, and an exception is thrown when no new source node can be identified. This logic is not appropriate for the EC input stream and can be simplified. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
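The behavior the quoted {{readBuffer}} comment describes can be sketched in isolation: retry the same source node once so a transient failure (such as the DataNode closing an idle connection) does not surface as an application-level error. The IntSupplier here is a simplified stand-in for a positioned read against one DataNode, not the real DFSInputStream API.

```java
import java.util.function.IntSupplier;

public class RetrySameNodeSketch {
    /** Tries the read against the same node a second time after one
     *  transient failure, mirroring the "possibly retry the same node"
     *  comment; a second failure propagates to the caller. */
    static int readWithSameNodeRetry(IntSupplier readFromNode) {
        try {
            return readFromNode.getAsInt();
        } catch (RuntimeException transientError) {
            // Seek back to the same block source and try once more.
            return readFromNode.getAsInt();
        }
    }
}
```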
[jira] [Assigned] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] surendra singh lilhore reassigned HDFS-8277: Assignee: surendra singh lilhore Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Assignee: surendra singh lilhore Priority: Minor HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518688#comment-14518688 ] surendra singh lilhore commented on HDFS-8277: -- I would like to work on this. I will update the status soon Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Priority: Minor HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8240) During hdfs client is writing small file, missing block showed in namenode web when ha switch
[ https://issues.apache.org/jira/browse/HDFS-8240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518687#comment-14518687 ] Ricky Yang commented on HDFS-8240: -- Allen, this scenario appears many times. Why did you remove the Fix Version '2.4.0'? During hdfs client is writing small file, missing block showed in namenode web when ha switch - Key: HDFS-8240 URL: https://issues.apache.org/jira/browse/HDFS-8240 Project: Hadoop HDFS Issue Type: Bug Components: ha Affects Versions: 2.4.0 Environment: Linux 2.6.32-279.el6.x86_64 Reporter: Ricky Yang Assignee: Ricky Yang Attachments: HDFS-8240.txt Original Estimate: 216h Remaining Estimate: 216h Description: While an HDFS client was writing a small file, the active namenode was killed and the standby namenode became active; the namenode web UI then showed many missing blocks in the cluster. Unfortunately, reading the file with the missing block from the hdfs shell also failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HDFS-8275) Erasure Coding: Implement batched listing of erasure coding zones
[ https://issues.apache.org/jira/browse/HDFS-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liu resolved HDFS-8275. -- Resolution: Duplicate Hi Rakesh, this JIRA duplicates HDFS-8087. Erasure Coding: Implement batched listing of erasure coding zones -- Key: HDFS-8275 URL: https://issues.apache.org/jira/browse/HDFS-8275 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R The idea of this jira is to provide a batch API in {{DistributedFileSystem}} to list the {{ECZoneInfo}} objects. API signature: {code} /** * List all ErasureCoding zones. Incrementally fetches results from the server. */ public RemoteIterator<ECZoneInfo> listErasureCodingZones() throws IOException { return dfs.listErasureCodingZones(); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
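The API above returns a RemoteIterator that fetches zones from the server incrementally, one batch per RPC. A self-contained sketch of that incremental-fetch pattern; PageFetcher and the plain String element type are simplified stand-ins for the NameNode RPC and ECZoneInfo, not Hadoop's real classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class BatchedZoneIterator implements Iterator<String> {
    /** Stand-in for the server-side RPC: returns the next page of results
     *  starting at the given index; an empty page means no more results. */
    interface PageFetcher {
        List<String> fetchPage(int fromIndex);
    }

    private final PageFetcher fetcher;
    private List<String> batch = new ArrayList<>();
    private int posInBatch = 0;   // cursor within the current batch
    private int fetched = 0;      // total elements fetched so far
    private boolean exhausted = false;

    BatchedZoneIterator(PageFetcher fetcher) {
        this.fetcher = fetcher;
    }

    @Override
    public boolean hasNext() {
        if (posInBatch < batch.size()) {
            return true;          // still draining the local batch
        }
        if (exhausted) {
            return false;
        }
        batch = fetcher.fetchPage(fetched);  // one "RPC" per batch
        posInBatch = 0;
        fetched += batch.size();
        exhausted = batch.isEmpty();
        return !exhausted;
    }

    @Override
    public String next() {
        return batch.get(posInBatch++);      // sketch: no NoSuchElementException guard
    }
}
```

A caller iterates it like any other iterator and never sees the page boundaries, which is the point of the batched listing.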
[jira] [Commented] (HDFS-3107) HDFS truncate
[ https://issues.apache.org/jira/browse/HDFS-3107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518741#comment-14518741 ] Neeta Garimella commented on HDFS-3107: --- Thanks Yi. I will get the latest. HDFS truncate - Key: HDFS-3107 URL: https://issues.apache.org/jira/browse/HDFS-3107 Project: Hadoop HDFS Issue Type: New Feature Components: datanode, namenode Reporter: Lei Chang Assignee: Plamen Jeliazkov Fix For: 2.7.0 Attachments: HDFS-3107-13.patch, HDFS-3107-14.patch, HDFS-3107-15.patch, HDFS-3107-HDFS-7056-combined.patch, HDFS-3107.008.patch, HDFS-3107.15_branch2.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS-3107.patch, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar15.pdf, HDFS_truncate_semantics_Mar21.pdf, HDFS_truncate_semantics_Mar21.pdf, editsStored, editsStored.xml Original Estimate: 1,344h Remaining Estimate: 1,344h Systems with transaction support often need to undo changes made to the underlying storage when a transaction is aborted. Currently HDFS does not support truncate (a standard Posix operation) which is a reverse operation of append, which makes upper layer applications use ugly workarounds (such as keeping track of the discarded byte range per file in a separate metadata store, and periodically running a vacuum process to rewrite compacted files) to overcome this limitation of HDFS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-7281) Missing block is marked as corrupted block
[ https://issues.apache.org/jira/browse/HDFS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ming Ma updated HDFS-7281: -- Attachment: HDFS-7281-5.patch [~yzhangal], good catch. I have updated the patch and the release notes. Missing block is marked as corrupted block -- Key: HDFS-7281 URL: https://issues.apache.org/jira/browse/HDFS-7281 Project: Hadoop HDFS Issue Type: Bug Reporter: Ming Ma Assignee: Ming Ma Labels: supportability Attachments: HDFS-7281-2.patch, HDFS-7281-3.patch, HDFS-7281-4.patch, HDFS-7281-5.patch, HDFS-7281.patch In the situation where the block lost all its replicas, fsck shows the block is missing as well as corrupted. Perhaps it is better not to mark the block corrupted in this case. The reason it is marked as corrupted is numCorruptNodes == numNodes == 0 in the following code. {noformat} BlockManager final boolean isCorrupt = numCorruptNodes == numNodes; {noformat} Would like to clarify if it is the intent to mark missing block as corrupted or it is just a bug. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
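The distinction at issue above can be shown in isolation: with zero replicas the original check numCorruptNodes == numNodes is vacuously true (0 == 0), so a missing block is also reported as corrupt. Requiring at least one known replica keeps the two states separate. Method names here are illustrative stand-ins, not BlockManager's real API.

```java
public class MissingVsCorrupt {
    /** A block is corrupt only if it has replicas and every one is corrupt;
     *  the extra numNodes > 0 guard excludes the all-replicas-lost case. */
    static boolean isCorrupt(int numCorruptNodes, int numNodes) {
        return numNodes > 0 && numCorruptNodes == numNodes;
    }

    /** A block with no replicas at all is missing, not corrupt. */
    static boolean isMissing(int numNodes) {
        return numNodes == 0;
    }
}
```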
[jira] [Commented] (HDFS-7758) Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead
[ https://issues.apache.org/jira/browse/HDFS-7758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518706#comment-14518706 ] Hadoop QA commented on HDFS-7758: - \\ \\ | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | pre-patch | 14m 38s | Pre-patch trunk compilation is healthy. | | {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. | | {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 21 new or modified test files. | | {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. | | {color:green}+1{color} | javac | 7m 28s | There were no new javac warning messages. | | {color:green}+1{color} | javadoc | 9m 37s | There were no new javadoc warning messages. | | {color:green}+1{color} | release audit | 0m 22s | The applied patch does not increase the total number of release audit warnings. | | {color:red}-1{color} | checkstyle | 3m 55s | The applied patch generated 1 additional checkstyle issues. | | {color:green}+1{color} | install | 1m 36s | mvn install still works. | | {color:green}+1{color} | eclipse:eclipse | 0m 32s | The patch built with eclipse:eclipse. | | {color:green}+1{color} | findbugs | 3m 4s | The patch does not introduce any new Findbugs (version 2.0.3) warnings. | | {color:green}+1{color} | native | 3m 13s | Pre-build of native portion | | {color:green}+1{color} | hdfs tests | 165m 33s | Tests passed in hadoop-hdfs. 
| | | | 210m 3s | | \\ \\ || Subsystem || Report/Notes || | Patch URL | http://issues.apache.org/jira/secure/attachment/12728992/HDFS-7758.006.patch | | Optional Tests | javadoc javac unit findbugs checkstyle | | git revision | trunk / 5190923 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/10443/artifact/patchprocess/checkstyle-result-diff.txt | | hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/10443/artifact/patchprocess/testrun_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/10443/testReport/ | | Java | 1.7.0_55 | | uname | Linux asf901.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/10443/console | This message was automatically generated. Retire FsDatasetSpi#getVolumes() and use FsDatasetSpi#getVolumeRefs() instead - Key: HDFS-7758 URL: https://issues.apache.org/jira/browse/HDFS-7758 Project: Hadoop HDFS Issue Type: Improvement Components: datanode Affects Versions: 2.6.0 Reporter: Lei (Eddy) Xu Assignee: Lei (Eddy) Xu Attachments: HDFS-7758.000.patch, HDFS-7758.001.patch, HDFS-7758.002.patch, HDFS-7758.003.patch, HDFS-7758.004.patch, HDFS-7758.005.patch, HDFS-7758.006.patch HDFS-7496 introduced reference-counting the volume instances being used to prevent race condition when hot swapping a volume. However, {{FsDatasetSpi#getVolumes()}} can still leak the volume instance without increasing its reference count. In this JIRA, we retire the {{FsDatasetSpi#getVolumes()}} and propose {{FsDatasetSpi#getVolumeRefs()}} and etc. method to access {{FsVolume}}. Thus it makes sure that the consumer of {{FsVolume}} always has correct reference count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8275) Erasure Coding: Implement batched listing of erasure coding zones
[ https://issues.apache.org/jira/browse/HDFS-8275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518723#comment-14518723 ] Rakesh R commented on HDFS-8275: OK, thanks [~hitliuyi] for pointing this out. Since I have done some background study, I'm happy to take up HDFS-8087. Can you assign it to me if you have not yet started? :) Erasure Coding: Implement batched listing of erasure coding zones -- Key: HDFS-8275 URL: https://issues.apache.org/jira/browse/HDFS-8275 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Rakesh R Assignee: Rakesh R The idea of this jira is to provide a batch API in {{DistributedFileSystem}} to list the {{ECZoneInfo}} objects. API signature: {code} /** * List all ErasureCoding zones. Incrementally fetches results from the server. */ public RemoteIterator<ECZoneInfo> listErasureCodingZones() throws IOException { return dfs.listErasureCodingZones(); } {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8246) Get HDFS file name based on block pool id and block id
[ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HDFS-8246: -- Flags: (was: Patch) Get HDFS file name based on block pool id and block id -- Key: HDFS-8246 URL: https://issues.apache.org/jira/browse/HDFS-8246 Project: Hadoop HDFS Issue Type: New Feature Components: HDFS, hdfs-client, namenode Reporter: feng xu Assignee: feng xu Attachments: HDFS-8246.0.patch This feature provides an HDFS shell command and C/Java APIs to retrieve an HDFS file name based on a block pool id and block id. 1. The Java API in class DistributedFileSystem: public String getFileName(String poolId, long blockId) throws IOException 2. The C API in hdfs.c: char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId) 3. The HDFS shell command: hdfs dfs [generic options] -fn poolId blockId This feature is useful if you have an HDFS block file name in the local file system and want to find the related HDFS file name in the HDFS name space (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop). Each HDFS block file name in the local file system contains both the block pool id and the block id; for example, in the HDFS block file name /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825, the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825. The block pool id is uniquely tied to an HDFS name node/name space, and the block id is uniquely tied to an HDFS file within that name node/name space, so the combination of block pool id and block id uniquely identifies an HDFS file name. The shell command and C/Java APIs do not map the block pool id to a name node, so it is the user's responsibility to talk to the correct name node in a federated environment with multiple name nodes. The block pool id is used by the name node to check that the user is talking to the correct name node. 
The implementation is straightforward. The client request to get the HDFS file name reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem on the name node through RPC, and the new method does the following: (1) Validate the block pool id. (2) Create a Block based on the block id. (3) Get the BlockInfoContiguous from the Block. (4) Get the BlockCollection from the BlockInfoContiguous. (5) Get the file name from the BlockCollection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
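The lookup steps (1)-(5) above can be sketched with plain maps standing in for FSNamesystem's Block -> BlockInfoContiguous -> BlockCollection chain; the class, the in-memory index, and the example file path are all illustrative stand-ins, not Hadoop's real implementation.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Map;

public class BlockToFileLookup {
    private final String localPoolId;
    // Stand-in for steps (2)-(5): block id resolved straight to a file name.
    private final Map<Long, String> blockIdToFileName;

    BlockToFileLookup(String poolId, Map<Long, String> index) {
        this.localPoolId = poolId;
        this.blockIdToFileName = index;
    }

    String getFileName(String poolId, long blockId) throws IOException {
        // (1) Validate the block pool id: reject requests meant for another
        //     namespace, as the description says the name node should.
        if (!localPoolId.equals(poolId)) {
            throw new IOException("Wrong block pool id: " + poolId);
        }
        // (2)-(5) Resolve block id -> stored block -> collection -> file name.
        String name = blockIdToFileName.get(blockId);
        if (name == null) {
            throw new FileNotFoundException("No file for block " + blockId);
        }
        return name;
    }
}
```

Using the block pool id and block id from the sample path in the description, a matching pool id resolves to the indexed file name while a mismatched pool id is rejected.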
[jira] [Updated] (HDFS-8246) Get HDFS file name based on block pool id and block id
[ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feng xu updated HDFS-8246: -- Flags: Patch Get HDFS file name based on block pool id and block id -- Key: HDFS-8246 URL: https://issues.apache.org/jira/browse/HDFS-8246 Project: Hadoop HDFS Issue Type: New Feature Components: HDFS, hdfs-client, namenode Reporter: feng xu Assignee: feng xu Attachments: HDFS-8246.0.patch This feature provides an HDFS shell command and C/Java APIs to retrieve an HDFS file name based on a block pool id and block id. 1. The Java API in class DistributedFileSystem: public String getFileName(String poolId, long blockId) throws IOException 2. The C API in hdfs.c: char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId) 3. The HDFS shell command: hdfs dfs [generic options] -fn poolId blockId This feature is useful if you have an HDFS block file name in the local file system and want to find the related HDFS file name in the HDFS name space (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop). Each HDFS block file name in the local file system contains both the block pool id and the block id; for example, in the HDFS block file name /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825, the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825. The block pool id is uniquely tied to an HDFS name node/name space, and the block id is uniquely tied to an HDFS file within that name node/name space, so the combination of block pool id and block id uniquely identifies an HDFS file name. The shell command and C/Java APIs do not map the block pool id to a name node, so it is the user's responsibility to talk to the correct name node in a federated environment with multiple name nodes. The block pool id is used by the name node to check that the user is talking to the correct name node. The implementation is straightforward. 
The client request to get the HDFS file name reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem on the name node through RPC, and the new method does the following: (1) Validate the block pool id. (2) Create a Block based on the block id. (3) Get the BlockInfoContiguous from the Block. (4) Get the BlockCollection from the BlockInfoContiguous. (5) Get the file name from the BlockCollection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sekhon updated HDFS-8277: -- Priority: Minor (was: Major) Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon Priority: Minor HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-8277) Safemode enter fails when Standby NameNode is down
[ https://issues.apache.org/jira/browse/HDFS-8277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14517333#comment-14517333 ] Hari Sekhon commented on HDFS-8277: --- Ah, I have both back up now, so this command works regardless; it won't be a great test. Perhaps this should be labelled an improvement instead of a bug, since other hdfs commands do auto-failover in HA setups. Safemode enter fails when Standby NameNode is down -- Key: HDFS-8277 URL: https://issues.apache.org/jira/browse/HDFS-8277 Project: Hadoop HDFS Issue Type: Bug Components: ha, HDFS, namenode Affects Versions: 2.6.0 Environment: HDP 2.2.0 Reporter: Hari Sekhon HDFS fails to enter safemode when the Standby NameNode is down (eg. due to AMBARI-10536). {code}hdfs dfsadmin -safemode enter safemode: Call From nn2/x.x.x.x to nn1:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused{code} This appears to be a bug in that it's not trying both NameNodes like the standard hdfs client code does, and is instead stopping after getting a connection refused from nn1 which is down. I verified normal hadoop fs writes and reads via cli did work at this time, using nn2. I happened to run this command as the hdfs user on nn2 which was the surviving Active NameNode. After I re-bootstrapped the Standby NN to fix it the command worked as expected again. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HDFS-8246) Get HDFS file name based on block pool id and block id
[ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang reassigned HDFS-8246: - Assignee: Andrew Wang (was: feng xu) Get HDFS file name based on block pool id and block id -- Key: HDFS-8246 URL: https://issues.apache.org/jira/browse/HDFS-8246 Project: Hadoop HDFS Issue Type: New Feature Components: HDFS, hdfs-client, namenode Reporter: feng xu Assignee: Andrew Wang Attachments: HDFS-8246.0.patch This feature provides an HDFS shell command and C/Java APIs to retrieve an HDFS file name based on a block pool id and block id. 1. The Java API in class DistributedFileSystem: public String getFileName(String poolId, long blockId) throws IOException 2. The C API in hdfs.c: char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId) 3. The HDFS shell command: hdfs dfs [generic options] -fn poolId blockId This feature is useful if you have an HDFS block file name in the local file system and want to find the related HDFS file name in the HDFS name space (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop). Each HDFS block file name in the local file system contains both the block pool id and the block id; for example, in the HDFS block file name /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825, the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825. The block pool id is uniquely tied to an HDFS name node/name space, and the block id is uniquely tied to an HDFS file within that name node/name space, so the combination of block pool id and block id uniquely identifies an HDFS file name. The shell command and C/Java APIs do not map the block pool id to a name node, so it is the user's responsibility to talk to the correct name node in a federated environment with multiple name nodes. The block pool id is used by the name node to check that the user is talking to the correct name node. 
The implementation is straightforward. The client request to get the HDFS file name reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem on the name node through RPC, and the new method does the following: (1) Validate the block pool id. (2) Create a Block based on the block id. (3) Get the BlockInfoContiguous from the Block. (4) Get the BlockCollection from the BlockInfoContiguous. (5) Get the file name from the BlockCollection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-8246) Get HDFS file name based on block pool id and block id
[ https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Wang updated HDFS-8246: -- Status: Patch Available (was: In Progress) Get HDFS file name based on block pool id and block id -- Key: HDFS-8246 URL: https://issues.apache.org/jira/browse/HDFS-8246 Project: Hadoop HDFS Issue Type: New Feature Components: HDFS, hdfs-client, namenode Reporter: feng xu Assignee: Andrew Wang Attachments: HDFS-8246.0.patch This feature provides an HDFS shell command and C/Java APIs to retrieve an HDFS file name based on a block pool id and block id. 1. The Java API in class DistributedFileSystem: public String getFileName(String poolId, long blockId) throws IOException 2. The C API in hdfs.c: char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId) 3. The HDFS shell command: hdfs dfs [generic options] -fn poolId blockId This feature is useful if you have an HDFS block file name in the local file system and want to find the related HDFS file name in the HDFS name space (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop). Each HDFS block file name in the local file system contains both the block pool id and the block id; for example, in the HDFS block file name /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825, the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id is 1073741825. The block pool id is uniquely tied to an HDFS name node/name space, and the block id is uniquely tied to an HDFS file within that name node/name space, so the combination of block pool id and block id uniquely identifies an HDFS file name. The shell command and C/Java APIs do not map the block pool id to a name node, so it is the user's responsibility to talk to the correct name node in a federated environment with multiple name nodes. The block pool id is used by the name node to check that the user is talking to the correct name node. 
The implementation is straightforward. The client request to get the HDFS file name reaches the new method String getFileName(String poolId, long blockId) in FSNamesystem on the name node through RPC, and the new method does the following: (1) Validate the block pool id. (2) Create a Block based on the block id. (3) Get the BlockInfoContiguous from the Block. (4) Get the BlockCollection from the BlockInfoContiguous. (5) Get the file name from the BlockCollection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)