date:20160713

[jira] [Created] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)

Yongjun Zhang created HDFS-10625:


 Summary:  VolumeScanner to report why a block is found bad
 Key: HDFS-10625
 URL: https://issues.apache.org/jira/browse/HDFS-10625
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs
Reporter: Yongjun Zhang


VolumeScanner may report:

{code}
WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
blk_1170125248_96458336 on /d/dfs/dn
{code}

It would be helpful to report why the block is bad and, when the block is 
corrupt, where the first corrupted chunk in the block is.
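
As a rough illustration of what "the first corrupted chunk" could mean, the sketch below checksums a byte array in fixed-size chunks and returns the offset of the first chunk whose checksum no longer matches the recorded one. It is not VolumeScanner code: the class {{FirstCorruptChunkFinder}}, both method names, and the use of {{java.util.zip.CRC32}} are assumptions made for the example, and the real DataNode reads checksums from the block's meta file rather than recomputing them in memory.

```java
import java.util.zip.CRC32;

// Hypothetical helper, not part of HDFS: locates the first chunk whose
// CRC32 no longer matches the checksum recorded for it.
final class FirstCorruptChunkFinder {

    // Computes one CRC32 value per chunk of bytesPerChunk bytes.
    static long[] checksums(byte[] data, int bytesPerChunk) {
        int chunks = (data.length + bytesPerChunk - 1) / bytesPerChunk;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int start = i * bytesPerChunk;
            int len = Math.min(bytesPerChunk, data.length - start);
            CRC32 crc = new CRC32();
            crc.update(data, start, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Returns the byte offset of the first mismatching chunk, or -1 if
    // every chunk matches its expected checksum.
    static long firstCorruptOffset(byte[] data, long[] expected, int bytesPerChunk) {
        long[] actual = checksums(data, bytesPerChunk);
        if (actual.length != expected.length) {
            throw new IllegalArgumentException("chunk count mismatch");
        }
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != expected[i]) {
                return (long) i * bytesPerChunk;
            }
        }
        return -1L;
    }
}
```

With an offset like this in hand, the warning could name both the cause (checksum mismatch) and the position, instead of only the block ID and volume.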




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Deleted] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Andrew Wang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang deleted HDFS-10624:
---


> VolumeScanner to report why a block is found bad
> 
>
> Key: HDFS-10624
> URL: https://issues.apache.org/jira/browse/HDFS-10624
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yongjun Zhang
>





[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10624:
-
Description: (was: Seeing the following on DN log. 

{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 
received exception java.io.EOFException: Premature EOF: no length prefix 
available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK 
operation  src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on 
/ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting because of exception
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.204.64.151, 
datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, 
infoSecurePort=1175, ipcPort=1120, 
storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 
10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at 
org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection

[jira] [Updated] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-10624:
-
Description: 
Seeing the following on DN log. 

{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 
received exception java.io.EOFException: Premature EOF: no length prefix 
available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK 
operation  src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on 
/ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting because of exception
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.204.64.151, 
datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, 
infoSecurePort=1175, ipcPort=1120, 
storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 
10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at 
org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Connection reset by

[jira] [Created] (HDFS-10624) VolumeScanner to report why a block is found bad

2016-07-13 Thread Yongjun Zhang (JIRA)

Yongjun Zhang created HDFS-10624:


 Summary: VolumeScanner to report why a block is found bad
 Key: HDFS-10624
 URL: https://issues.apache.org/jira/browse/HDFS-10624
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs
Reporter: Yongjun Zhang


Seeing the following on DN log. 

{code}
2016-04-07 20:27:45,416 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
opWriteBlock BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 
received exception java.io.EOFException: Premature EOF: no length prefix 
available
2016-04-07 20:27:45,416 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: 
rn2-lampp-lapp1115.rno.apple.com:1110:DataXceiver error processing WRITE_BLOCK 
operation  src: /10.204.64.137:45112 dst: /10.204.64.151:1110
java.io.EOFException: Premature EOF: no length prefix available
at 
org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2241)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:738)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:244)
at java.lang.Thread.run(Thread.java:745)
2016-04-07 20:27:46,116 WARN 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96458336 on 
/ngs8/app/lampp/dfs/dn
2016-04-07 20:27:46,117 ERROR 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting because of exception
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
at 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
2016-04-07 20:27:46,118 INFO 
org.apache.hadoop.hdfs.server.datanode.VolumeScanner: 
VolumeScanner(/ngs8/app/lampp/dfs/dn, DS-a14baf2b-a1ef-4282-8d88-3203438e708e) 
exiting.
2016-04-07 20:27:46,442 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
DatanodeRegistration(10.204.64.151, 
datanodeUuid=6064994a-6769-4192-9377-83f78bd3d7a6, infoPort=0, 
infoSecurePort=1175, ipcPort=1120, 
storageInfo=lv=-56;cid=cluster6;nsid=1112595121;c=0):Failed to transfer 
BP-1800173197-10.204.68.5-125156296:blk_1170125248_96465013 to 
10.204.64.10:1110 got
java.net.SocketException: Original Exception : java.io.IOException: Connection 
reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
at 
org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
at 
org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at 
org.apache.hadoop.security.SaslOutputStream.write(SaslOutputStream.java:190)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(BlockSender.java:585)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.doSendBlock(BlockSender.java:758)
at 
org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:705)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer.run(DataNode.java:2154)
at 
org.apache.hadoop.hdfs.server.datanode.DataNode.transferReplicaForPipelineRecovery(DataNode.java:2884)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.transferBlock(DataXceiver.java:862)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opTransferBlock(Receiver.java:200)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:118)
at

[jira] [Commented] (HDFS-8065) Erasure coding: Support truncate at striped group boundary

2016-07-13 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376304#comment-15376304
 ] 

Rakesh R commented on HDFS-8065:


Hi, [~drankye], [~umamaheswararao], [~zhz]

IMHO, truncate on a block group boundary can be supported without much effort 
compared to the partial stripe case; the latter will be addressed via 
HDFS-7622. Could you please help me push this JIRA for the 
{{3.0.0-alpha1}} release? Partial stripe handling is needed for lease 
recovery, truncation, and hflush, which I feel can be implemented together 
based on the HDFS-7661 design discussions. Thoughts welcome.


> Erasure coding: Support truncate at striped group boundary
> --
>
> Key: HDFS-8065
> URL: https://issues.apache.org/jira/browse/HDFS-8065
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Rakesh R
> Attachments: HDFS-8065-00.patch, HDFS-8065-01.patch
>
>
> We can support truncate at the striped group boundary first.




[jira] [Updated] (HDFS-8065) Erasure coding: Support truncate at striped group boundary

2016-07-13 Thread Rakesh R (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-8065?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-8065:
---
Summary: Erasure coding: Support truncate at striped group boundary  (was: 
Erasure coding: Support truncate at striped group boundary.)

> Erasure coding: Support truncate at striped group boundary
> --
>
> Key: HDFS-8065
> URL: https://issues.apache.org/jira/browse/HDFS-8065
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Yi Liu
>Assignee: Rakesh R
> Attachments: HDFS-8065-00.patch, HDFS-8065-01.patch
>
>
> We can support truncate at the striped group boundary first.




[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376257#comment-15376257
 ] 

Hudson commented on HDFS-10617:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10095 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10095/])
HDFS-10617. PendingReconstructionBlocks.size() should be synchronized. (kihwal: 
rev 2bbc3ea1b54c25c28eb04caa48dece5cfc19d613)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/PendingReconstructionBlocks.java


> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Fix For: 2.9.0
>
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 
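
The point in the description can be sketched as follows. This is a minimal stand-in, not the actual class: {{PendingBlocksSketch}}, its {{String}} keys, and its method names are assumptions for the example. The idea it shows is that a plain {{HashMap}} shared between threads needs every access, including {{size()}}, to go through the same lock.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the class named in the issue; the real
// PendingReconstructionBlocks has different fields and methods.
final class PendingBlocksSketch {
    private final Map<String, Integer> pending = new HashMap<>();

    // Records one more pending reconstruction for the given block.
    void increment(String blockId) {
        synchronized (pending) {
            pending.merge(blockId, 1, Integer::sum);
        }
    }

    // Drops the block from the pending set.
    void remove(String blockId) {
        synchronized (pending) {
            pending.remove(blockId);
        }
    }

    // Guarded by the same lock as the mutators, so a reader never
    // observes the map in the middle of a concurrent update.
    int size() {
        synchronized (pending) {
            return pending.size();
        }
    }
}
```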




[jira] [Commented] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby

2016-07-13 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376238#comment-15376238
 ] 

Rakesh R commented on HDFS-10477:
-

It looks like the test case is failing due to a lock release; please check. Secondly, 
when catching and swallowing {{InterruptedException}}, should we call 
{{Thread.currentThread().interrupt()}} afterward so that the interrupt status 
isn't lost?

{code}
java.lang.IllegalMonitorStateException: null
at 
java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryRelease(ReentrantReadWriteLock.java:371)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1261)
at 
java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.unlock(ReentrantReadWriteLock.java:1131)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeUnlock(FSNamesystem.java:1533)
at 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processExtraRedundancyBlocksOnReCommission(BlockManager.java:3861)
at 
org.apache.hadoop.hdfs.server.blockmanagement.DecommissionManager.stopDecommission(DecommissionManager.java:221)
at 
org.apache.hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy.testPlacementWithLocalRackNodesDecommissioned(TestDefaultBlockPlacementPolicy.java:117)
{code}
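
For reference, the interrupt-restoring pattern mentioned above looks roughly like this. The class and method names are hypothetical and the code is not taken from the patch; it only shows the standard idiom of re-asserting the interrupt flag after catching {{InterruptedException}}.

```java
// Hypothetical helper illustrating the pattern; not code from the patch.
final class InterruptAwareSleeper {

    // Sleeps for the given time. Returns false if the sleep was
    // interrupted, after setting the thread's interrupt flag again so
    // that callers further up the stack can still see it.
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Thread.sleep clears the flag when it throws; restore it.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Without the {{interrupt()}} call in the catch block, a caller that checks {{Thread.currentThread().isInterrupted()}} afterward would see {{false}} and carry on as if nothing had happened.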

> Stop decommission a rack of DataNodes caused NameNode fail over to standby
> --
>
> Key: HDFS-10477
> URL: https://issues.apache.org/jira/browse/HDFS-10477
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.2
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
> Attachments: HDFS-10477.002.patch, HDFS-10477.003.patch, 
> HDFS-10477.004.patch, HDFS-10477.patch
>
>
> In our cluster, when we stopped decommissioning a rack that has 46 DataNodes, 
> the Namesystem was locked for about 7 minutes, as the log below shows:
> {code}
> 2016-05-26 20:11:41,697 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.27:1004
> 2016-05-26 20:11:51,171 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 285258 over-replicated blocks on 10.142.27.27:1004 during recommissioning
> 2016-05-26 20:11:51,171 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.118:1004
> 2016-05-26 20:11:59,972 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 279923 over-replicated blocks on 10.142.27.118:1004 during recommissioning
> 2016-05-26 20:11:59,972 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.113:1004
> 2016-05-26 20:12:09,007 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 294307 over-replicated blocks on 10.142.27.113:1004 during recommissioning
> 2016-05-26 20:12:09,008 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.117:1004
> 2016-05-26 20:12:18,055 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 314381 over-replicated blocks on 10.142.27.117:1004 during recommissioning
> 2016-05-26 20:12:18,056 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.130:1004
> 2016-05-26 20:12:25,938 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 272779 over-replicated blocks on 10.142.27.130:1004 during recommissioning
> 2016-05-26 20:12:25,939 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.121:1004
> 2016-05-26 20:12:34,134 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 287248 over-replicated blocks on 10.142.27.121:1004 during recommissioning
> 2016-05-26 20:12:34,134 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.33:1004
> 2016-05-26 20:12:43,020 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 299868 over-replicated blocks on 10.142.27.33:1004 during recommissioning
> 2016-05-26 20:12:43,020 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.137:1004
> 2016-05-26 20:12:52,220 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 303914 over-replicated blocks on 10.142.27.137:1004 during recommissioning
> 2016-05-26 20:12:52,220 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.51:1004
> 2016-05-26 20:13:00,362 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 281175 over-replicated blocks on

[jira] [Created] (HDFS-10623) Remove unused import of httpclient.HttpConnection from TestWebHdfsTokens.

2016-07-13 Thread Jitendra Nath Pandey (JIRA)

Jitendra Nath Pandey created HDFS-10623:
---

 Summary: Remove unused import of httpclient.HttpConnection from 
TestWebHdfsTokens.
 Key: HDFS-10623
 URL: https://issues.apache.org/jira/browse/HDFS-10623
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Reporter: Jitendra Nath Pandey
Assignee: Hanisha Koneru


TestWebHdfsTokens imports httpclient.HttpConnection without using it, which 
creates an unnecessary reference to httpclient. The import can be removed.




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Kihwal Lee (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10617:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.9.0
   Status: Resolved  (was: Patch Available)

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Fix For: 2.9.0
>
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Commented] (HDFS-10612) Optimize mechanism when block report size exceed the limit of PB message

2016-07-13 Thread Yuanbo Liu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376229#comment-15376229
 ] 

Yuanbo Liu commented on HDFS-10612:
---

Block report size is difficult to calculate because the number of blocks and the 
block report size are not linearly related when the datanode uses blockbuffer. 
Calculating the report size whenever we write a block would also cost performance. 
The least we can do is add a warning log, expose block report size as a metric, and 
add this metric to the datanode web UI.

> Optimize mechanism when block report size exceed the limit of PB message
> 
>
> Key: HDFS-10612
> URL: https://issues.apache.org/jira/browse/HDFS-10612
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yuanbo Liu
>
> Community has made block report size configurable in HDFS-10312. But there is 
> still a risk for Hadoop. If block report size exceeds PB message size, the 
> cluster may be in a dangerous situation




[jira] [Assigned] (HDFS-10612) Optimize mechanism when block report size exceed the limit of PB message

2016-07-13 Thread Yuanbo Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu reassigned HDFS-10612:
-

Assignee: Yuanbo Liu

> Optimize mechanism when block report size exceed the limit of PB message
> 
>
> Key: HDFS-10612
> URL: https://issues.apache.org/jira/browse/HDFS-10612
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
>
> Community has made block report size configurable in HDFS-10312. But there is 
> still a risk for Hadoop. If block report size exceeds PB message size, the 
> cluster may be in a dangerous situation




[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376227#comment-15376227
 ] 

Kihwal Lee commented on HDFS-10617:
---

+1 lgtm

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Commented] (HDFS-10534) NameNode WebUI should display DataNode usage histogram

2016-07-13 Thread Kai Sasaki (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376186#comment-15376186
 ] 

Kai Sasaki commented on HDFS-10534:
---

[~zhz] Oh, you are next to dust team! Thanks!

> NameNode WebUI should display DataNode usage histogram
> --
>
> Key: HDFS-10534
> URL: https://issues.apache.org/jira/browse/HDFS-10534
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode, ui
>Reporter: Zhe Zhang
>Assignee: Kai Sasaki
> Attachments: HDFS-10534.01.patch, HDFS-10534.02.patch, 
> HDFS-10534.03.patch, HDFS-10534.04.patch, HDFS-10534.05.patch, 
> HDFS-10534.06.patch, HDFS-10534.07.patch, HDFS-10534.08.patch, Screen Shot 
> 2016-06-23 at 6.25.50 AM.png, Screen Shot 2016-07-07 at 23.29.14.png, 
> table_histogram.html
>
>
> In addition to *Min/Median/Max*, another meaningful metric for cluster 
> balance is DN usage in histogram style.
> Since the NN already provides the information needed to calculate a histogram of 
> DN usage, it can be done on the JS side.




[jira] [Commented] (HDFS-10467) Router-based HDFS federation

2016-07-13 Thread Inigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376170#comment-15376170
 ] 

Inigo Goiri commented on HDFS-10467:


After checking the code, I think there might be a bunch of overlaps between this 
work and YARN-2915. I'd like to explore what we could move into Hadoop commons 
to manage a federated space. I would probably open a new JIRA for that.

In addition, given the feedback collected during the last few weeks, it seems 
like the community is OK with going in this direction, so I'd like to start 
moving the review process forward.
To simplify the review, I propose to convert this JIRA into an umbrella and 
split the current patch into smaller subtasks. For now, I would like to start 
with:
# Minimum Router
# State Store interface
# ZooKeeper State Store implementation

We can add more tasks if people think this is the way to go. It's probably a 
good idea to create a new branch for this effort. Thoughts? Opinions?

> Router-based HDFS federation
> 
>
> Key: HDFS-10467
> URL: https://issues.apache.org/jira/browse/HDFS-10467
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: fs
>Affects Versions: 2.7.2
>Reporter: Inigo Goiri
> Attachments: HDFS Router Federation.pdf, HDFS-10467.PoC.001.patch, 
> HDFS-10467.PoC.patch, HDFS-Router-Federation-Prototype.patch
>
>
> Add a Router to provide a federated view of multiple HDFS clusters.




[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Yiqun Lin (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376158#comment-15376158
 ] 

Yiqun Lin commented on HDFS-10619:
--

Hi [~daryn], the patch looks good, but the failed test seems related. I 
tested your patch in my local environment and found that the byte array 
{{path}} passed in sometimes looks like this:
{code}
[null, [102, 111, 111]]
{code}
pathComponents[0] is null while pathComponents[1] has values, so the method 
{{DFSUtil#byteArray2PathString}} throws an NPE.
Can we add the following logic to handle this special case?
{code}
  public static String byteArray2PathString(byte[][] pathComponents,
      int offset, int length) {
    if (pathComponents.length == 0) {
      return "";
    }
    Preconditions.checkArgument(offset >= 0 && offset < pathComponents.length);
    Preconditions.checkArgument(length >= 0 && offset + length <=
        pathComponents.length);
    if (pathComponents.length == 1
        && (pathComponents[0] == null || pathComponents[0].length == 0)) {
      return Path.SEPARATOR;
    } else if (pathComponents.length > 1
        && (pathComponents[0] == null || pathComponents[0].length == 0)) {
      // Add this logic
      return "";
    }
    ...
{code}
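
To try the proposed guard outside the Hadoop tree, here is a minimal standalone sketch. It is not the real {{DFSUtil}}: the class name {{PathStringSketch}} and the {{SEPARATOR}} constant are stand-ins, plain exceptions replace {{Preconditions}}, and the joining loop after the guard is an assumption that fills in for the part elided as {{...}} above. Note that the condition as written also matches a zero-length first component, so the sketch returns an empty string for that input too.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative stand-in; not the real org.apache.hadoop.hdfs.DFSUtil. */
final class PathStringSketch {
  // Stand-in for Path.SEPARATOR.
  static final String SEPARATOR = "/";

  static String byteArray2PathString(byte[][] pathComponents,
      int offset, int length) {
    if (pathComponents.length == 0) {
      return "";
    }
    // Plain checks in place of Preconditions.checkArgument.
    if (offset < 0 || offset >= pathComponents.length) {
      throw new IllegalArgumentException("bad offset: " + offset);
    }
    if (length < 0 || offset + length > pathComponents.length) {
      throw new IllegalArgumentException("bad length: " + length);
    }
    boolean firstEmpty =
        pathComponents[0] == null || pathComponents[0].length == 0;
    if (pathComponents.length == 1 && firstEmpty) {
      return SEPARATOR;
    } else if (pathComponents.length > 1 && firstEmpty) {
      // The proposed guard: return before a null component is dereferenced.
      return "";
    }
    // ASSUMPTION: the elided remainder joins the selected components with
    // the separator; this loop is illustrative only.
    StringBuilder result = new StringBuilder();
    for (int i = offset; i < offset + length; i++) {
      result.append(new String(pathComponents[i], StandardCharsets.UTF_8));
      if (i < offset + length - 1) {
        result.append(SEPARATOR);
      }
    }
    return result.toString();
  }

  public static void main(String[] args) {
    // The input reported in the comment: a null head followed by "foo".
    byte[][] withNullHead = {null, {102, 111, 111}};
    System.out.println("[" + byteArray2PathString(withNullHead, 0, 2) + "]");
    byte[][] twoNames = {{102, 111, 111}, {98, 97, 114}};
    System.out.println(byteArray2PathString(twoNames, 0, 2));
  }
}
```

With the guard in place, the {{[null, [102, 111, 111]]}} input yields an empty string instead of an NPE.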

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.




[jira] [Commented] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376069#comment-15376069
 ] 

Hadoop QA commented on HDFS-10477:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 62m 21s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 84m 22s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817821/HDFS-10477.004.patch |
| JIRA Issue | HDFS-10477 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux b7d6870f248a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d180505 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16052/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16052/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16052/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Stop decommission a rack of DataNodes caused NameNode fail over to standby
> --
>
> Key: HDFS-10477
>

[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376060#comment-15376060
 ] 

Hadoop QA commented on HDFS-10441:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
42s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
23s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
26s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
15s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
47s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
54s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 49m 18s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817832/HDFS-10441.HDFS-8707.013.patch
 |
| JIRA Issue | HDFS-10441 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux d22cdda6748f 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d18e396 |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16053/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16053/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
>

[jira] [Updated] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing

2016-07-13 Thread Jiayi Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiayi Zhou updated HDFS-10519:
--
Attachment: HDFS-10519.007.patch

Add a new parameter isTail to selectInputStream() on the NameNode side and a 
field in RemoteEditLogManifest. When we do in-progress tailing, we'll use 
committedTxnId rather than highestWrittenTxnId. This won't affect other parts 
which also need to select in-progress edits.
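
The selection rule described above can be pictured roughly as follows. This is only a sketch: the class and method names are hypothetical and do not come from the patch, and the real change lives inside {{selectInputStream()}} and {{RemoteEditLogManifest}} rather than in a standalone helper.

```java
/** Hypothetical helper; names mirror the comment, not the actual patch. */
final class TailingBoundSketch {
  /**
   * Picks the highest transaction ID a reader may consume.
   * In-progress tailing (isTail == true) stops at the committed ID;
   * every other caller keeps using the highest written ID.
   */
  static long selectUpperBound(boolean isTail, long committedTxnId,
      long highestWrittenTxnId) {
    return isTail ? committedTxnId : highestWrittenTxnId;
  }

  public static void main(String[] args) {
    // Example values only: 90 committed, 100 written.
    System.out.println(selectUpperBound(true, 90L, 100L));
    System.out.println(selectUpperBound(false, 90L, 100L));
  }
}
```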


> Add a configuration option to enable in-progress edit log tailing
> -
>
> Key: HDFS-10519
> URL: https://issues.apache.org/jira/browse/HDFS-10519
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha
>Reporter: Jiayi Zhou
>Assignee: Jiayi Zhou
>Priority: Minor
> Attachments: HDFS-10519.001.patch, HDFS-10519.002.patch, 
> HDFS-10519.003.patch, HDFS-10519.004.patch, HDFS-10519.005.patch, 
> HDFS-10519.006.patch, HDFS-10519.007.patch
>
>
> Standby Namenode has the option to do in-progress edit log tailing to improve 
> data freshness. In-progress tailing is already implemented, but it is not 
> enabled by default, and there is no configuration key to turn it on.
> Adding a configuration key that lets the Standby Namenode enable it is 
> reasonable and would be a basis for further improvement of the Standby Namenode.




[jira] [Commented] (HDFS-10601) Improve log message to include hostname when the NameNode is in safemode

2016-07-13 Thread Daniel Templeton (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376009#comment-15376009
 ] 

Daniel Templeton commented on HDFS-10601:
-

LGTM.

> Improve log message to include hostname when the NameNode is in safemode
> 
>
> Key: HDFS-10601
> URL: https://issues.apache.org/jira/browse/HDFS-10601
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Minor
> Attachments: HDFS-10601.001.patch, HDFS-10601.002.patch
>
>
> When remote NN operations are involved, it would be nice to have the Namenode 
> hostname in safemode notification log.




[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Lei (Eddy) Xu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376006#comment-15376006
 ] 

Lei (Eddy) Xu commented on HDFS-10598:
--

Great. Thanks, Arpit.

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each node with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}}), the disk usage is:
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}.




[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-07-13 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10441:
---
Attachment: HDFS-10441.HDFS-8707.013.patch

Thanks for the review [~xiaowei.zhu].

bq. In the for loop, it should be standby_info_ instead of active_info_.
That would have been nasty to debug..

bq. Another small indent problem in the same file
Fixed that too.

[~bobhansen] Would you mind taking another look at this when you get a chance?

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-10441.HDFS-8707.013.patch, 
> HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.




[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375952#comment-15375952
 ] 

Hadoop QA commented on HDFS-10544:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 16m 
17s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
53s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
5s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
15s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
51s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2976 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  1m 
19s{color} | {color:red} The patch 78 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
7s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m  7s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
21s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}166m 31s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
| JDK v1.7.0_101 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.namenode.TestFileTruncate |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c420dfe |
| JIRA Patch URL |

[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375945#comment-15375945
 ] 

Hadoop QA commented on HDFS-10620:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 76m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m  9s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817797/HDFS-10620.001.patch |
| JIRA Issue | HDFS-10620 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux c860fc399cf4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / af8f480 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16050/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16050/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16050/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
>

[jira] [Created] (HDFS-10622) o.a.h.security.TestGroupsCaching.testBackgroundRefreshCounters seems flaky

2016-07-13 Thread Mingliang Liu (JIRA)

Mingliang Liu created HDFS-10622:


 Summary: 
o.a.h.security.TestGroupsCaching.testBackgroundRefreshCounters seems flaky
 Key: HDFS-10622
 URL: https://issues.apache.org/jira/browse/HDFS-10622
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: security, test
Affects Versions: 2.8.0
Reporter: Mingliang Liu


h5. Error Message

expected:<1> but was:<0>

h5. Stacktrace

java.lang.AssertionError: expected:<1> but was:<0>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at org.junit.Assert.assertEquals(Assert.java:542)
at org.apache.hadoop.security.TestGroupsCaching.testBackgroundRefreshCounters(TestGroupsCaching.java:638)




[jira] [Updated] (HDFS-10477) Stop decommission a rack of DataNodes caused NameNode fail over to standby

2016-07-13 Thread yunjiong zhao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yunjiong zhao updated HDFS-10477:
-
Attachment: HDFS-10477.004.patch

Updated the patch with the following changes:
1. Release the lock after processing each storage.
2. Sleep 1 millisecond before trying to reacquire the lock.

Thanks [~arpiagariu] and [~kihwal].
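
For readers who have not opened the patch, the locking pattern described in the two changes looks roughly like the sketch below. It is not taken from HDFS-10477.004.patch: {{ChunkedLockSketch}} and its members are hypothetical, each storage is modeled as a {{Runnable}}, and a plain {{ReentrantReadWriteLock}} stands in for the Namesystem lock.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of per-storage locking; not the actual patch. */
final class ChunkedLockSketch {
  // Fair lock so that waiting threads get a turn between storages.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
  private int lockAcquisitions;

  /** Processes each storage under its own short write-lock hold. */
  void processStorages(List<Runnable> storages) {
    for (int i = 0; i < storages.size(); i++) {
      lock.writeLock().lock();
      lockAcquisitions++;
      try {
        storages.get(i).run();
      } finally {
        // Change 1: release the lock once this storage is done.
        lock.writeLock().unlock();
      }
      if (i < storages.size() - 1) {
        try {
          // Change 2: pause briefly before taking the lock again.
          Thread.sleep(1);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
      }
    }
  }

  int getLockAcquisitions() {
    return lockAcquisitions;
  }

  boolean isHeldByCurrentThread() {
    return lock.isWriteLockedByCurrentThread();
  }

  public static void main(String[] args) {
    ChunkedLockSketch sketch = new ChunkedLockSketch();
    Runnable storage = () -> { };
    sketch.processStorages(List.of(storage, storage, storage));
    System.out.println(sketch.getLockAcquisitions());
  }
}
```

The trade-off is more lock acquisitions in exchange for shorter individual holds, so other operations can interleave instead of waiting for the whole node to be processed.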



> Stop decommission a rack of DataNodes caused NameNode fail over to standby
> --
>
> Key: HDFS-10477
> URL: https://issues.apache.org/jira/browse/HDFS-10477
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.7.2
>Reporter: yunjiong zhao
>Assignee: yunjiong zhao
> Attachments: HDFS-10477.002.patch, HDFS-10477.003.patch, 
> HDFS-10477.004.patch, HDFS-10477.patch
>
>
> In our cluster, when we stopped decommissioning a rack that has 46 DataNodes, 
> the Namesystem was locked for about 7 minutes, as the log below shows:
> {code}
> 2016-05-26 20:11:41,697 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.27:1004
> 2016-05-26 20:11:51,171 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 285258 over-replicated blocks on 10.142.27.27:1004 during recommissioning
> 2016-05-26 20:11:51,171 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.118:1004
> 2016-05-26 20:11:59,972 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 279923 over-replicated blocks on 10.142.27.118:1004 during recommissioning
> 2016-05-26 20:11:59,972 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.113:1004
> 2016-05-26 20:12:09,007 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 294307 over-replicated blocks on 10.142.27.113:1004 during recommissioning
> 2016-05-26 20:12:09,008 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.117:1004
> 2016-05-26 20:12:18,055 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 314381 over-replicated blocks on 10.142.27.117:1004 during recommissioning
> 2016-05-26 20:12:18,056 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.130:1004
> 2016-05-26 20:12:25,938 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 272779 over-replicated blocks on 10.142.27.130:1004 during recommissioning
> 2016-05-26 20:12:25,939 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.121:1004
> 2016-05-26 20:12:34,134 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 287248 over-replicated blocks on 10.142.27.121:1004 during recommissioning
> 2016-05-26 20:12:34,134 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.33:1004
> 2016-05-26 20:12:43,020 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 299868 over-replicated blocks on 10.142.27.33:1004 during recommissioning
> 2016-05-26 20:12:43,020 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.137:1004
> 2016-05-26 20:12:52,220 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 303914 over-replicated blocks on 10.142.27.137:1004 during recommissioning
> 2016-05-26 20:12:52,220 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.51:1004
> 2016-05-26 20:13:00,362 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 281175 over-replicated blocks on 10.142.27.51:1004 during recommissioning
> 2016-05-26 20:13:00,362 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.12:1004
> 2016-05-26 20:13:08,756 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 274880 over-replicated blocks on 10.142.27.12:1004 during recommissioning
> 2016-05-26 20:13:08,757 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.15:1004
> 2016-05-26 20:13:17,185 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 286334 over-replicated blocks on 10.142.27.15:1004 during recommissioning
> 2016-05-26 20:13:17,185 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager: Stop 
> Decommissioning 10.142.27.14:1004
> 2016-05-26 20:13:25,369 INFO 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Invalidated 
> 280219 over-replicated blocks on 10.142.27.14:1004 during

[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption

2016-07-13 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375922#comment-15375922
 ] 

Yongjun Zhang commented on HDFS-10587:
--

The corruption appears to start at the very beginning of the chunk right after 
the block transfer (which copies data up to the end of the previous chunk).

This looks similar to HDFS-4660. Unfortunately, we don't have the exact block 
file and checksum file on the source and the destination to compare. Otherwise, 
it would be easier to tell what might have happened.


 

> Incorrect offset/length calculation in pipeline recovery causes block 
> corruption
> 
>
> Key: HDFS-10587
> URL: https://issues.apache.org/jira/browse/HDFS-10587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10587.001.patch
>
>
> We found that incorrect offset and length calculation in pipeline recovery may 
> cause block corruption and result in missing blocks under a very unfortunate 
> scenario. 
> (1) A client established a pipeline and started writing data to it.
> (2) One of the data nodes in the pipeline restarted, closing the socket, and 
> some written data was unacknowledged.
> (3) The client replaced the failed data node with a new one, initiating a block 
> transfer to copy existing data in the block to the new datanode.
> (4) The block was transferred to the new node. Crucially, the entire block, 
> including the unacknowledged data, was transferred.
> (5) The last chunk (512 bytes) was not a full chunk, but the destination 
> still reserved the whole chunk in its buffer and wrote the entire buffer to 
> disk, so some of the written data is garbage.
> (6) When the transfer was done, the destination data node converted the 
> replica from temporary to rbw, which set its visible length to the number of 
> bytes on disk. That is to say, it thought whatever was transferred was 
> acknowledged. However, the visible length of the replica (rounded 
> up to the next multiple of 512) differs from that of the transfer source. [1]
> (7) The client then truncated the block in an attempt to remove unacknowledged 
> data. However, because the visible length was equal to the bytes on disk, 
> it did not truncate the unacknowledged data.
> (8) When new data was appended to the destination, it skipped the bytes 
> already on disk. Therefore, whatever was written as garbage was not replaced.
> (9) The volume scanner detected the corrupt replica, but due to HDFS-10512, it 
> wouldn’t tell the NameNode to mark the replica as corrupt, so the client 
> continued to form a pipeline using the corrupt replica.
> (10) Finally, the DN that had the only healthy replica was restarted. The 
> NameNode then updated the pipeline to contain only the corrupt replica.
> (11) The client continued to write to the corrupt replica, because neither the 
> client nor the data node itself knew the replica was corrupt. When the restarted 
> datanodes came back, their replicas were stale, even though they were not corrupt. 
> Therefore, none of the replicas was both good and up to date.
> The sequence of events was reconstructed based on the DataNode/NameNode logs and 
> my understanding of the code.
> Incidentally, we have observed the same sequence of events on two independent 
> clusters.
> [1]
> The sender has the replica as follows:
> 2016-04-15 22:03:05,066 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW
>   getNumBytes() = 41381376
>   getBytesOnDisk()  = 41381376
>   getVisibleLength()= 41186444
>   getVolume()   = /hadoop-i/data/current
>   getBlockFile()= 
> /hadoop-i/data/current/BP-1043567091-10.216.26.120-1343682168507/current/rbw/blk_1556997324
>   bytesAcked=41186444
>   bytesOnDisk=41381376
> while the receiver has the replica as follows:
> 2016-04-15 22:03:05,068 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW
>   getNumBytes() = 41186816
>   getBytesOnDisk()  = 41186816
>   getVisibleLength()= 41186816
>   getVolume()   = /hadoop-g/data/current
>   getBlockFile()= 
> /hadoop-g/data/current/BP-1043567091-10.216.26.120-1343682168507/current/rbw/blk_1556997324
>   bytesAcked=41186816
>   bytesOnDisk=41186816
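
The numbers in the two log excerpts above are consistent with the round-up described in step (6): the receiver's visible length equals the sender's bytesAcked rounded up to the next multiple of 512. The sketch below only reproduces that arithmetic; the class and method names are made up for illustration and are not taken from the DataNode code.

```java
/** Hypothetical helper that reproduces the chunk arithmetic from the logs above. */
class ChunkMath {
  static final int CHUNK_SIZE = 512; // bytes per chunk, as stated in the description

  /** Rounds a length up to the next multiple of the chunk size. */
  static long roundUpToChunk(long length) {
    return ((length + CHUNK_SIZE - 1) / CHUNK_SIZE) * CHUNK_SIZE;
  }

  /** Number of bytes in the last, partial chunk (0 if the length is chunk-aligned). */
  static long partialChunkBytes(long length) {
    return length % CHUNK_SIZE;
  }

  public static void main(String[] args) {
    long senderBytesAcked = 41186444L;                // from the sender's log
    long rounded = roundUpToChunk(senderBytesAcked);
    System.out.println(rounded);                      // 41186816, the receiver's visible length
    System.out.println(partialChunkBytes(senderBytesAcked)); // 140 bytes in the partial chunk
    System.out.println(rounded - senderBytesAcked);   // 372 bytes of padding in the last chunk
  }
}
```

In other words, the last chunk on the receiver would contain 140 bytes of acknowledged data followed by 372 bytes that were never acknowledged by the client.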




[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-13 Thread Xiaowei Zhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375865#comment-15375865
 ] 

Xiaowei Zhu commented on HDFS-10441:


The latest patch looks almost good to go with one small typo in rpc_engine.cc:
{code}
bool HANamenodeTracker::IsCurrentStandby_locked(const ::asio::ip::tcp::endpoint 
) const {
  for(unsigned int i=0;i

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Component/s: ha
 balancer & mover

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, ha
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.
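
The condition proposed in the description above can be sketched in isolation. This is not the DFSUtil code or the committed patch: the class, the method, and the boolean parameters are hypothetical stand-ins for the HA check and the provider's {{useLogicalURI}} setting, and the physical URI is simply passed in rather than resolved from configuration.

```java
import java.net.URI;

/** Illustrative only; a stand-in for the decision made per nameservice. */
class NameServiceUriSketch {
  /**
   * Chooses the URI for one nameservice.
   *
   * @param nsId          nameservice ID
   * @param haEnabled     whether HA is enabled for this nameservice
   * @param useLogicalUri hypothetical flag mirroring the provider's useLogicalURI
   * @param physicalUri   already-resolved physical NameNode URI (assumed given)
   */
  static URI uriFor(String nsId, boolean haEnabled, boolean useLogicalUri, URI physicalUri) {
    if (haEnabled && useLogicalUri) {
      // Only hand out the logical URI when the provider can actually use it.
      return URI.create("hdfs://" + nsId);
    }
    // Otherwise fall back to the physical address.
    return physicalUri;
  }
}
```

With this shape, an HA nameservice whose provider does not use logical URIs gets a physical address instead of hdfs://nsId.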




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Affects Version/s: 2.6.1

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, ha
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Target Version/s: 2.6.5  (was: 2.8.0, 2.7.3, 2.6.5, 3.0.0-alpha1)

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover, ha
>Affects Versions: 2.6.1
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10598:
-
Affects Version/s: (was: 2.8.0)

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.6.5
   2.7.3
   Status: Resolved  (was: Patch Available)

I just pushed to branch-2.7 and branch-2.6. Thanks again for the review from 
[~shv]!

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.7.3, 2.9.0, 2.6.5, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375848#comment-15375848
 ] 

Arpit Agarwal commented on HDFS-10598:
--

Disk Balancer is not in branch-2, so I've updated the versions accordingly.

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-10598:
-
Target Version/s: 3.0.0-beta1  (was: 2.9.0, 3.0.0-beta1)

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375844#comment-15375844
 ] 

Arpit Agarwal commented on HDFS-10598:
--

Hi [~eddyxu], thanks for reporting this problem and posting a patch. I believe 
Anu is out on vacation for the next few weeks. I will review your fix.

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 2.8.0, 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Lei (Eddy) Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-10598:
-
 Assignee: Lei (Eddy) Xu  (was: Anu Engineer)
Fix Version/s: (was: 2.9.0)
   Status: Patch Available  (was: Open)

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 2.8.0, 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Lei (Eddy) Xu
>Priority: Critical
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Updated] (HDFS-10598) DiskBalancer does not execute multi-steps plan.

2016-07-13 Thread Lei (Eddy) Xu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-10598:
-
Attachment: HDFS-10598.00.patch

Uploaded a patch that changes {{DiskBalancerMover#copyBlocks}} so that it does 
not call {{setExitFlag}} on a normal exit. Instead, {{setExitFlag}} is called 
from {{executePlan()}}. 

However, whether {{setExitFlag()}} is needed in {{executePlan()}} at all is 
unclear to me. [~anu], could you describe the cases it was designed for?

Thanks.
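
To make the suspected behaviour concrete, here is a small self-contained model of it. It is not the DiskBalancer code and not the attached patch: the class, the constructor flag, and the string step names are hypothetical, and block copying is replaced by a counter.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical model of a multi-step plan guarded by an exit flag. */
class MultiStepPlanSketch {
  private final AtomicBoolean shouldExit = new AtomicBoolean(false);
  private final boolean exitAfterEachStep;

  /** @param exitAfterEachStep true models the reported bug, false models the fix. */
  MultiStepPlanSketch(boolean exitAfterEachStep) {
    this.exitAfterEachStep = exitAfterEachStep;
  }

  /** Runs one step; returns false without doing anything if the exit flag is set. */
  private boolean copyBlocks(String step) {
    if (shouldExit.get()) {
      return false;
    }
    // ... the actual block copying for this step would happen here ...
    if (exitAfterEachStep) {
      shouldExit.set(true); // suspected behaviour: flag set on every normal return
    }
    return true;
  }

  /** Runs all steps and returns how many were actually executed. */
  int executePlan(List<String> steps) {
    int executed = 0;
    for (String step : steps) {
      if (copyBlocks(step)) {
        executed++;
      }
    }
    shouldExit.set(true); // set the flag once, after the whole plan has run
    return executed;
  }
}
```

In this model, a three-step plan executes only its first step when the flag is set on every return, which matches the disk usage reported above, where only one source volume was drained.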

> DiskBalancer does not execute multi-steps plan.
> ---
>
> Key: HDFS-10598
> URL: https://issues.apache.org/jira/browse/HDFS-10598
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: diskbalancer
>Affects Versions: 2.8.0, 3.0.0-beta1
>Reporter: Lei (Eddy) Xu
>Assignee: Anu Engineer
>Priority: Critical
> Fix For: 2.9.0
>
> Attachments: HDFS-10598.00.patch
>
>
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating 
> some files to fill HDFS, I added two more small disks to one DN and ran the 
> diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return path calls 
> {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times 
> from {{DiskBalancer#executePlan}}. 




[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375793#comment-15375793
 ] 

Zhe Zhang commented on HDFS-10544:
--

The test failures reported on the branch-2.7 patch are unrelated, and the tests 
pass locally. Committing to branch-2.7 soon.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Commented] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375756#comment-15375756
 ] 

Staffan Friberg commented on HDFS-10620:


To avoid the allocation entirely, the StringBuilder can be created only when 
debug logging is enabled:

{noformat}
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
index 1a76e09..349b018 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
@@ -1319,7 +1319,8 @@ private void addToInvalidates(BlockInfo storedBlock) {
 if (!isPopulatingReplQueues()) {
   return;
 }
-StringBuilder datanodes = new StringBuilder();
+StringBuilder datanodes = blockLog.isDebugEnabled()
+? new StringBuilder() : null;
 for (DatanodeStorageInfo storage : blocksMap.getStorages(storedBlock)) {
   if (storage.getState() != State.NORMAL) {
 continue;
@@ -1328,10 +1329,12 @@ private void addToInvalidates(BlockInfo storedBlock) {
   final Block b = getBlockOnStorage(storedBlock, storage);
   if (b != null) {
 invalidateBlocks.add(b, node, false);
-datanodes.append(node).append(" ");
+if (datanodes != null) {
+  datanodes.append(node).append(" ");
+}
   }
 }
-if (datanodes.length() != 0) {
+if (datanodes != null && datanodes.length() != 0) {
   blockLog.debug("BLOCK* addToInvalidates: {} {}", storedBlock, datanodes);
 }
   }
{noformat}
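
The same pattern as in the diff above, reduced to a standalone example that can be run without Hadoop. The class and method names are invented for illustration, and the boolean parameter stands in for {{blockLog.isDebugEnabled()}}; the method returns the message instead of logging it.

```java
import java.util.List;

/** Hypothetical standalone version of the "allocate only when logging" pattern. */
class LazyLogBuffer {
  /**
   * Builds the debug message for the given nodes, or returns null when debug
   * logging is disabled (in which case no StringBuilder is allocated at all).
   */
  static String describe(List<String> nodes, boolean debugEnabled) {
    StringBuilder sb = debugEnabled ? new StringBuilder() : null;
    for (String node : nodes) {
      // The real per-node work would happen here regardless of logging.
      if (sb != null) {
        sb.append(node).append(' ');
      }
    }
    return (sb != null && sb.length() != 0) ? sb.toString() : null;
  }
}
```

The cost is the extra null checks the issue description mentions; the benefit is that the non-logging path neither allocates nor appends.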


> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.4
>Reporter: Staffan Friberg
> Attachments: HDFS-10620.001.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.




[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Fix Version/s: (was: 3.0.0-alpha1)

> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.4
>Reporter: Staffan Friberg
> Attachments: HDFS-10620.001.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.




[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Attachment: HDFS-10620.001.patch

> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.4
>Reporter: Staffan Friberg
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10620.001.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.




[jira] [Updated] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Staffan Friberg updated HDFS-10620:
---
Fix Version/s: 3.0.0-alpha1
   Status: Patch Available  (was: Open)

> StringBuilder created and appended even if logging is disabled
> --
>
> Key: HDFS-10620
> URL: https://issues.apache.org/jira/browse/HDFS-10620
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.4
>Reporter: Staffan Friberg
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10620.001.patch
>
>
> In BlockManager.addToInvalidates the StringBuilder is appended to during the 
> delete even if logging isn't active.
> Could avoid allocating the StringBuilder as well, but not sure if it is 
> really worth it to add null handling in the code.




[jira] [Created] (HDFS-10621) libhdfs++: Implement . (dot) and .. (double-dot) semantics

2016-07-13 Thread Anatoli Shein (JIRA)

Anatoli Shein created HDFS-10621:


 Summary: libhdfs++: Implement . (dot) and .. (double-dot) semantics
 Key: HDFS-10621
 URL: https://issues.apache.org/jira/browse/HDFS-10621
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Anatoli Shein


We need to implement . (dot) and .. (double-dot) semantics in hdfs.cc in 
getAbsolutePath, hdfsSetWorkingDirectory, hdfsGetWorkingDirectory.
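The dot and double-dot handling requested here could look roughly like the sketch below. It is written in Java for illustration only; libhdfs++ itself is C++, and the class and method names (DotPathSketch, resolve) are hypothetical and not part of hdfs.cc. The sketch assumes that ".." at the root stays at the root.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DotPathSketch {
  /**
   * Resolves path against workingDir and collapses "." and ".." components.
   * Hypothetical helper; not the libhdfs++ implementation.
   */
  static String resolve(String workingDir, String path) {
    // Relative paths are interpreted against the working directory.
    String full = path.startsWith("/") ? path : workingDir + "/" + path;
    Deque<String> parts = new ArrayDeque<>();
    for (String part : full.split("/")) {
      if (part.isEmpty() || part.equals(".")) {
        continue;               // skip empty components and "."
      }
      if (part.equals("..")) {
        parts.pollLast();       // go up one level; no-op when already at the root
      } else {
        parts.addLast(part);
      }
    }
    if (parts.isEmpty()) {
      return "/";
    }
    StringBuilder sb = new StringBuilder();
    for (String part : parts) {
      sb.append('/').append(part);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // prints /user/bob/data
    System.out.println(resolve("/user/alice", "../bob/./data"));
  }
}
```

A component stack keeps the logic in one pass and makes the root case explicit, which is probably the main edge case the three functions need to agree on.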




[jira] [Created] (HDFS-10620) StringBuilder created and appended even if logging is disabled

2016-07-13 Thread Staffan Friberg (JIRA)

Staffan Friberg created HDFS-10620:
--

 Summary: StringBuilder created and appended even if logging is 
disabled
 Key: HDFS-10620
 URL: https://issues.apache.org/jira/browse/HDFS-10620
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.4
Reporter: Staffan Friberg


In BlockManager.addToInvalidates the StringBuilder is appended to during the 
delete even if logging isn't active.

Could avoid allocating the StringBuilder as well, but not sure if it is really 
worth it to add null handling in the code.
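As a minimal sketch of the idea described above, the snippet below skips both the allocation and the appends when debug logging is off. It uses java.util.logging so that it is self-contained; BlockManager uses a different logger, and the class name, method name, and return value here are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedDebugSketch {
  private static final Logger LOG = Logger.getLogger("GuardedDebugSketch");

  /**
   * Visits every node; builds the debug string only when FINE logging is on.
   * Returns the logged node list, or null when nothing was logged
   * (the return value exists only to make the sketch easy to check).
   */
  static String invalidate(List<String> nodes) {
    // Allocate the builder only if the message would actually be logged.
    StringBuilder sb = LOG.isLoggable(Level.FINE) ? new StringBuilder() : null;
    for (String node : nodes) {
      // The real per-node work would happen here regardless of logging.
      if (sb != null) {
        sb.append(node).append(' ');
      }
    }
    if (sb != null && sb.length() != 0) {
      LOG.fine("invalidate: " + sb);
      return sb.toString();
    }
    return null;
  }

  public static void main(String[] args) {
    // With the default logging configuration FINE is off, so this prints null.
    System.out.println(invalidate(Arrays.asList("dn1", "dn2")));
  }
}
```

The cost is the extra null checks mentioned in the description; whether that is worth it depends on how hot the method is.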




[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375723#comment-15375723
 ] 

Hadoop QA commented on HDFS-10544:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 
28s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
12s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
16s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
8s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
10s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2976 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  1m 
17s{color} | {color:red} The patch has 78 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 44m 52s{color} 
| {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
20s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}133m 23s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.hdfs.TestLeaseRecovery2 |
|   | hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
| JDK v1.7.0_101 Failed junit tests | 
hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots |
|   | hadoop.tools.TestJMXGet |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:c420dfe |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817775/HDFS-10544-branch-2.7.patch
 |
| JIRA Issue |

[jira] [Commented] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375635#comment-15375635
 ] 

Hadoop QA commented on HDFS-10519:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 61m  
8s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 23s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817776/HDFS-10519.006.patch |
| JIRA Issue | HDFS-10519 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  xml  |
| uname | Linux 73c8744b0f21 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / eb47163 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16048/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16048/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Add a configuration option to enable in-progress edit log tailing
> -
>
> Key: HDFS-10519
> URL: https://issues.apache.org/jira/browse/HDFS-10519
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha
>Reporter: Jiayi Zhou
>Assignee: Jiayi Zhou
>Priority: Minor
> Attachments: HDFS-10519.001.patch,

[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375617#comment-15375617
 ] 

Hadoop QA commented on HDFS-10301:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 368 unchanged - 12 fixed = 370 total (was 380) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 61m 58s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 82m 35s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.metrics2.sink.TestRollingFileSystemSinkWithHdfs |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.TestFileChecksum |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817774/HDFS-10301.008.patch |
| JIRA Issue | HDFS-10301 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux f991214b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / eb47163 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16045/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16045/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16045/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16045/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> BlockReport retransmissions may lead to storages falsely being

[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375596#comment-15375596
 ] 

Hadoop QA commented on HDFS-10619:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
24s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 74m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 94m 41s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestSnapshotCommands |
|   | hadoop.fs.contract.hdfs.TestHDFSContractRootDirectory |
|   | hadoop.cli.TestHDFSCLI |
|   | hadoop.hdfs.TestDFSUpgradeFromImage |
|   | hadoop.hdfs.server.namenode.TestGetBlockLocations |
|   | hadoop.hdfs.TestHDFSFileSystemContract |
|   | hadoop.hdfs.server.namenode.TestFsck |
|   | hadoop.hdfs.TestDatanodeStartupFixesLegacyStorageIDs |
|   | hadoop.hdfs.TestReservedRawPaths |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestDatanodeLayoutUpgrade |
|   | hadoop.hdfs.server.mover.TestStorageMover |
|   | hadoop.hdfs.TestClientReportBadBlock |
|   | hadoop.hdfs.web.TestWebHdfsFileSystemContract |
|   | hadoop.hdfs.TestDFSShell |
|   | hadoop.hdfs.server.namenode.ha.TestHAFsck |
|   | hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot |
|   | hadoop.cli.TestCryptoAdminCLI |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.TestErasureCodingPolicies |
|   | hadoop.hdfs.TestEncryptionZones |
|   | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
|   | hadoop.fs.permission.TestStickyBit |
|   | hadoop.hdfs.TestEncryptionZonesWithKMS |
|   | hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot |
|   | hadoop.fs.TestGlobPaths |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817765/HDFS-10619.patch |
| JIRA Issue | HDFS-10619 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite

[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375559#comment-15375559
 ] 

Zhe Zhang commented on HDFS-10619:
--

This looks like a good fix. Thanks Daryn. 

If an iip is created but {{getPath}} is never called, we are increasing the 
memory usage by 1 {{String}}. But I think this is pretty rare so overall the 
fix is an improvement.

Thoughts from others? I'll hold off a +1 till the end of today (because of the 
above tradeoff).
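The trade-off above can be sketched as follows. This is a simplified model, not the actual INodesInPath code: the class name and the representation of path components as strings are assumptions. The path is built once in the constructor, so getPath does no work per call, at the cost of holding one String even if getPath is never called.

```java
class CachedPathSketch {
  private final String[] components;
  private final String path;   // built once, in the constructor

  CachedPathSketch(String... components) {
    this.components = components.clone();
    // Eager caching: one String is allocated here, whether or not it is used.
    this.path = "/" + String.join("/", this.components);
  }

  /** Returns the cached path; no string building per call. */
  String getPath() {
    return path;
  }

  public static void main(String[] args) {
    CachedPathSketch iip = new CachedPathSketch("user", "alice", "data");
    // prints /user/alice/data
    System.out.println(iip.getPath());
  }
}
```

A lazy variant (build on the first getPath call and remember the result) would avoid the unused String, but it needs either synchronization or tolerance for a benign race if instances are shared between threads.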

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.




[jira] [Commented] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-07-13 Thread Vinitha Reddy Gankidi (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375518#comment-15375518
 ] 

Vinitha Reddy Gankidi commented on HDFS-10301:
--

I apologize for attaching the wrong patch. Thanks for pointing it out, [~cmccabe]. 
I have now uploaded the correct patch (008), which calls the isStorageReport method. 
Adding an optional list of storage ID strings to the .proto file would add more 
overhead, since these optional parameters would have to be sent with default 
values in all other block report RPCs, not just the last RPC of the block 
report. I can add more comments in the code to explain what's going on. 
Thoughts?

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out while sending a block report 
> and then send the report again. While processing these two reports at the 
> same time, the NameNode can interleave the processing of storages from the 
> different reports. This corrupts the blockReportId field, which makes the 
> NameNode think that some storages are zombies. Replicas from zombie storages 
> are immediately removed, causing missing blocks.
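The interleaving described above can be reproduced with a small model. This is a simplified sketch, not NameNode code: it assumes that each storage remembers the ID of the last report that touched it, and that after the last storage of a report is processed, every storage carrying a different report ID is treated as a zombie. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ZombieRaceSketch {
  // storage ID -> ID of the last block report that touched it
  private final Map<String, Long> lastReportId = new LinkedHashMap<>();

  /**
   * Processes one storage of a report. When it is the last storage of that
   * report, returns the storages that would be judged zombie.
   */
  List<String> process(long reportId, String storage, boolean lastInReport) {
    lastReportId.put(storage, reportId);
    List<String> zombies = new ArrayList<>();
    if (lastInReport) {
      for (Map.Entry<String, Long> e : lastReportId.entrySet()) {
        if (e.getValue() != reportId) {
          zombies.add(e.getKey());   // touched by another report: looks stale
        }
      }
    }
    return zombies;
  }

  public static void main(String[] args) {
    ZombieRaceSketch nn = new ZombieRaceSketch();
    nn.process(1, "s1", false);   // original report, storage s1
    nn.process(2, "s1", false);   // retransmitted report, storage s1
    // Original report finishes with s2; s1 now carries report ID 2.
    // prints [s1]
    System.out.println(nn.process(1, "s2", true));
  }
}
```

In this model s1 is healthy and was reported by both transmissions, yet it is flagged only because the retransmission overwrote its report ID before the original report finished.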




[jira] [Updated] (HDFS-10519) Add a configuration option to enable in-progress edit log tailing

2016-07-13 Thread Jiayi Zhou (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiayi Zhou updated HDFS-10519:
--
Attachment: HDFS-10519.006.patch

Also add a boolean flag in Journal for the same purpose.

> Add a configuration option to enable in-progress edit log tailing
> -
>
> Key: HDFS-10519
> URL: https://issues.apache.org/jira/browse/HDFS-10519
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha
>Reporter: Jiayi Zhou
>Assignee: Jiayi Zhou
>Priority: Minor
> Attachments: HDFS-10519.001.patch, HDFS-10519.002.patch, 
> HDFS-10519.003.patch, HDFS-10519.004.patch, HDFS-10519.005.patch, 
> HDFS-10519.006.patch
>
>
> The Standby Namenode can tail in-progress edit logs to improve data 
> freshness. In-progress tailing is already implemented, but it is not enabled 
> in the default configuration, and there is no configuration key to turn it on.
> Adding a configuration key that lets the Standby Namenode enable it is 
> reasonable and would be a basis for further improvements to the Standby Namenode.




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Attachment: HDFS-10544-branch-2.7.patch

Attaching branch-2.7 patch to trigger Jenkins.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-10544-branch-2.7.patch, HDFS-10544.00.patch, 
> HDFS-10544.01.patch, HDFS-10544.02.patch, HDFS-10544.03.patch, 
> HDFS-10544.04.patch, HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is at {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also consider whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.
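The proposed check could be sketched as below. This is a self-contained model, not the DFSUtil code: the NameService class, its fields, and the uris method are hypothetical stand-ins for the configuration lookups, and the physical address is assumed to be already resolved.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NameServiceUriSketch {
  /** Hypothetical stand-in for the per-nameservice configuration. */
  static class NameService {
    final String id;
    final boolean haEnabled;
    final boolean useLogicalUri;      // what the proxy provider reports
    final String physicalAddress;     // host:port, assumed already resolved

    NameService(String id, boolean haEnabled, boolean useLogicalUri,
        String physicalAddress) {
      this.id = id;
      this.haEnabled = haEnabled;
      this.useLogicalUri = useLogicalUri;
      this.physicalAddress = physicalAddress;
    }
  }

  /** Returns a logical URI only when HA is on and the provider uses logical URIs. */
  static List<URI> uris(List<NameService> services) {
    List<URI> ret = new ArrayList<>();
    for (NameService ns : services) {
      if (ns.haEnabled && ns.useLogicalUri) {
        ret.add(URI.create("hdfs://" + ns.id));
      } else {
        // Otherwise fall back to the physical address for this nameservice.
        ret.add(URI.create("hdfs://" + ns.physicalAddress));
      }
    }
    return ret;
  }

  public static void main(String[] args) {
    System.out.println(uris(Arrays.asList(
        new NameService("ns1", true, true, "nn1.example.com:8020"),
        new NameService("ns2", true, false, "10.0.0.5:8020"))));
  }
}
```

Here ns2 models a nameservice whose proxy provider does not use logical URIs, so the caller receives an address it can connect to directly.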




[jira] [Updated] (HDFS-10301) BlockReport retransmissions may lead to storages falsely being declared zombie if storage report processing happens out of order

2016-07-13 Thread Vinitha Reddy Gankidi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinitha Reddy Gankidi updated HDFS-10301:
-
Attachment: HDFS-10301.008.patch

> BlockReport retransmissions may lead to storages falsely being declared 
> zombie if storage report processing happens out of order
> 
>
> Key: HDFS-10301
> URL: https://issues.apache.org/jira/browse/HDFS-10301
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.6.1
>Reporter: Konstantin Shvachko
>Assignee: Vinitha Reddy Gankidi
>Priority: Critical
> Attachments: HDFS-10301.002.patch, HDFS-10301.003.patch, 
> HDFS-10301.004.patch, HDFS-10301.005.patch, HDFS-10301.006.patch, 
> HDFS-10301.007.patch, HDFS-10301.008.patch, HDFS-10301.01.patch, 
> HDFS-10301.sample.patch, zombieStorageLogs.rtf
>
>
> When the NameNode is busy, a DataNode can time out sending a block report and 
> then send the block report again. While processing these two reports at the 
> same time, the NameNode can interleave processing of storages from different 
> reports. This screws up the blockReportId field, which makes the NameNode 
> think that some storages are zombies. Replicas from zombie storages are 
> immediately removed, causing missing blocks.




[jira] [Updated] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Daryn Sharp (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-10619:
---
Attachment: HDFS-10619.patch

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.




[jira] [Updated] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Daryn Sharp (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-10619:
---
Status: Patch Available  (was: Open)

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.




[jira] [Created] (HDFS-10619) Cache path in InodesInPath

2016-07-13 Thread Daryn Sharp (JIRA)

Daryn Sharp created HDFS-10619:
--

 Summary: Cache path in InodesInPath
 Key: HDFS-10619
 URL: https://issues.apache.org/jira/browse/HDFS-10619
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: hdfs
Reporter: Daryn Sharp
Assignee: Daryn Sharp


INodesInPath#getPath, a frequently called method, dynamically builds the path.  
IIP should cache the path upon construction.
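
A minimal sketch of the idea, assuming the path components are fixed once the object is constructed. CachedPathSketch is a hypothetical simplified stand-in and not the real INodesInPath; it uses String components with an empty first element for the root, which is an assumption made only for this example.

```java
// Hypothetical stand-in for an inodes-in-path object; not the real class.
final class CachedPathSketch {
  private final String[] components;
  private final String path;  // built once here, never rebuilt

  CachedPathSketch(String[] components) {
    this.components = components.clone();
    this.path = buildPath(this.components);
  }

  // Frequently called accessor: returns the cached string, no allocation.
  String getPath() {
    return path;
  }

  // Joins components with '/'; {""} alone represents the root directory.
  private static String buildPath(String[] components) {
    String joined = String.join("/", components);
    return joined.isEmpty() ? "/" : joined;
  }
}
```

Repeated calls to getPath() then return the same String instance instead of rebuilding it on every call.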




[jira] [Commented] (HDFS-10616) Improve performance of path handling

2016-07-13 Thread Daryn Sharp (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375423#comment-15375423
 ] 

Daryn Sharp commented on HDFS-10616:


Will be an umbrella for sub-tasks to incrementally integrate large internal 
patches.  In combination with other internal changes (forthcoming IPC 
optimizations, other object allocation reductions), heap growth has 
dramatically slowed.

> Improve performance of path handling
> 
>
> Key: HDFS-10616
> URL: https://issues.apache.org/jira/browse/HDFS-10616
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>
> Path handling in the namesystem and directory is very inefficient.  The path 
> is repeatedly resolved, decomposed into path components, recombined to a full 
> path, and parsed again throughout the system.  This is directly inefficient 
> for general performance, and indirectly via unnecessary pressure on young gen 
> GC. The namesystem should only operate on paths, parse each path once into 
> inodes, and the directory should only operate on inodes.




[jira] [Commented] (HDFS-10441) libhdfs++: HA namenode support

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375380#comment-15375380
 ] 

Hadoop QA commented on HDFS-10441:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
12s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
31s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
22s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
14s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
10s{color} | {color:green} hadoop-hdfs-native-client in the patch passed with 
JDK v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817724/HDFS-10441.HDFS-8707.012.patch
 |
| JIRA Issue | HDFS-10441 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux 25e8ac932b81 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / d18e396 |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_91 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| JDK v1.7.0_101  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16043/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-native-client U: 
hadoop-hdfs-project/hadoop-hdfs-native-client |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16043/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
>

[jira] [Commented] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375352#comment-15375352
 ] 

Hadoop QA commented on HDFS-10617:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
 0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 60m  
0s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 81m 13s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817696/HDFS-10617.002.patch |
| JIRA Issue | HDFS-10617 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux af8fbc3ab72c 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d6d41e8 |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16041/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16041/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
>

[jira] [Updated] (HDFS-10441) libhdfs++: HA namenode support

2016-07-13 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-10441:
---
Attachment: HDFS-10441.HDFS-8707.012.patch

Just rebasing the patch since HDFS-9890 was committed to HDFS-8707.

> libhdfs++: HA namenode support
> --
>
> Key: HDFS-10441
> URL: https://issues.apache.org/jira/browse/HDFS-10441
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: James Clampffer
> Attachments: HDFS-10441.HDFS-8707.000.patch, 
> HDFS-10441.HDFS-8707.002.patch, HDFS-10441.HDFS-8707.003.patch, 
> HDFS-10441.HDFS-8707.004.patch, HDFS-10441.HDFS-8707.005.patch, 
> HDFS-10441.HDFS-8707.006.patch, HDFS-10441.HDFS-8707.007.patch, 
> HDFS-10441.HDFS-8707.008.patch, HDFS-10441.HDFS-8707.009.patch, 
> HDFS-10441.HDFS-8707.010.patch, HDFS-10441.HDFS-8707.011.patch, 
> HDFS-10441.HDFS-8707.012.patch, HDFS-8707.HDFS-10441.001.patch
>
>
> If a cluster is HA enabled then do proper failover.




[jira] [Commented] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375291#comment-15375291
 ] 

Hadoop QA commented on HDFS-10618:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  5s{color} 
| {color:red} HDFS-10618 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817707/HDFS-10618-b2.001.patch
 |
| JIRA Issue | HDFS-10618 |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16042/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race 
> condition
> -
>
> Key: HDFS-10618
> URL: https://issues.apache.org/jira/browse/HDFS-10618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch
>
>
> TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10618:
---
Status: Patch Available  (was: Open)

> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race 
> condition
> -
>
> Key: HDFS-10618
> URL: https://issues.apache.org/jira/browse/HDFS-10618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch
>
>
> TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10618:
---
Attachment: HDFS-10618-b2.001.patch

Attaching a branch-2 patch, since reconstruction is known as replication in 
branch-2 and below

> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race 
> condition
> -
>
> Key: HDFS-10618
> URL: https://issues.apache.org/jira/browse/HDFS-10618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10618-b2.001.patch, HDFS-10618.001.patch
>
>
> TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Updated] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10618:
---
Attachment: HDFS-10618.001.patch

Attaching a patch that fixes the race in the test by putting all of the 
pertinent test functionality inside of the write lock. This will prevent the 
Replication Monitor from running while the test is corrupting and placing 
blocks in their respective reconstruction structures. 

> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race 
> condition
> -
>
> Key: HDFS-10618
> URL: https://issues.apache.org/jira/browse/HDFS-10618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10618.001.patch
>
>
> TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Commented] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Eric Badger (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375264#comment-15375264
 ] 

Eric Badger commented on HDFS-10618:


Inside of the Replication Monitor, 
BlockManager.computeReconstructionWorkForBlocks() removes blocks from 
neededReconstruction, then computes locations for those blocks to be 
replicated, and then places them into pendingReconstruction. However, before 
computing the locations the write lock is released (and reacquired to add to 
pendingReconstruction). testPendingAndInvalidate can expose this race condition 
because it also indirectly calls 
BlockManager.computeReconstructionWorkForBlocks. The following scenario 
outlines how this test can fail:

1. ReplicationMonitor calls computeReconstructionWorkForBlocks, removes blocks 
from neededReconstruction, releases the write lock, and takes time computing 
the locations for replication
2. testPendingAndInvalidate calls computeReconstructionWorkForBlocks, sees 
nothing in neededReconstruction, spends 0 time computing locations, adds 
nothing to pendingReconstruction, and returns. 
3. testPendingAndInvalidate calls updateState() and indirectly sets 
pendingReconstructionBlocksCount to the current value of pendingReconstruction 
(which is 0, since the Replication Monitor is still computing the block 
locations and hasn't yet added the blocks to pendingReconstruction).
4. testPendingAndInvalidate checks the value of 
pendingReconstructionBlocksCount via getPendingReconstructionBlocksCount() and 
sees that it is 0, causing the associated assert to fail.

It is unclear to me whether or not this failure can happen outside of this 
test, since it is explicitly calling computeReconstructionWorkForBlocks, which 
is normally only called by the Replication Monitor. 
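
The interleaving above can be replayed deterministically in a small model. This is only a sketch under stated assumptions: ReconstructionRaceSketch, takeNeeded, addPending, and the block name blk_1 are hypothetical, and the two collections stand in for neededReconstruction and pendingReconstruction. The steps run sequentially on one thread in the order described, so no real locking is involved.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical model of the two structures; not BlockManager code.
final class ReconstructionRaceSketch {
  final Queue<String> needed = new ArrayDeque<>();
  final List<String> pending = new ArrayList<>();

  // First half of the work: drain everything out of "needed".
  List<String> takeNeeded() {
    List<String> taken = new ArrayList<>(needed);
    needed.clear();
    return taken;
  }

  // Second half, done after locations are computed: publish to "pending".
  void addPending(List<String> blocks) {
    pending.addAll(blocks);
  }

  // Replays the steps in order and returns the count the test would observe.
  static int replay() {
    ReconstructionRaceSketch s = new ReconstructionRaceSketch();
    s.needed.add("blk_1");
    List<String> monitorWork = s.takeNeeded();  // step 1: monitor drains
    List<String> testWork = s.takeNeeded();     // step 2: test finds nothing
    s.addPending(testWork);                     //         and adds nothing
    int observed = s.pending.size();            // steps 3-4: test reads count
    s.addPending(monitorWork);                  // monitor publishes too late
    return observed;
  }
}
```

In this model replay() returns 0 even though one block is in flight, which corresponds to the failing assert in the test.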

> TestPendingReconstruction#testPendingAndInvalidate is flaky due to race 
> condition
> -
>
> Key: HDFS-10618
> URL: https://issues.apache.org/jira/browse/HDFS-10618
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Created] (HDFS-10618) TestPendingReconstruction#testPendingAndInvalidate is flaky due to race condition

2016-07-13 Thread Eric Badger (JIRA)

Eric Badger created HDFS-10618:
--

 Summary: TestPendingReconstruction#testPendingAndInvalidate is 
flaky due to race condition
 Key: HDFS-10618
 URL: https://issues.apache.org/jira/browse/HDFS-10618
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.3-alpha
Reporter: Eric Badger
Assignee: Eric Badger


TestPendingReconstruction#testPendingAndInvalidate fails intermittently. 




[jira] [Updated] (HDFS-9890) libhdfs++: Add test suite to simulate network issues

2016-07-13 Thread James Clampffer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-9890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9890:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the work [~xiaowei.zhu], I just committed this to HDFS-8707.

> libhdfs++: Add test suite to simulate network issues
> 
>
> Key: HDFS-9890
> URL: https://issues.apache.org/jira/browse/HDFS-9890
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: James Clampffer
>Assignee: Xiaowei Zhu
> Attachments: HDFS-9890.HDFS-8707.000.patch, 
> HDFS-9890.HDFS-8707.001.patch, HDFS-9890.HDFS-8707.002.patch, 
> HDFS-9890.HDFS-8707.003.patch, HDFS-9890.HDFS-8707.004.patch, 
> HDFS-9890.HDFS-8707.005.patch, HDFS-9890.HDFS-8707.006.patch, 
> HDFS-9890.HDFS-8707.007.patch, HDFS-9890.HDFS-8707.008.patch, 
> HDFS-9890.HDFS-8707.009.patch, HDFS-9890.HDFS-8707.010.patch, 
> HDFS-9890.HDFS-8707.011.patch, HDFS-9890.HDFS-8707.012.patch, 
> HDFS-9890.HDFS-8707.012.patch, HDFS-9890.HDFS-8707.013.patch, 
> HDFS-9890.HDFS-8707.013.patch, HDFS-9890.HDFS-8707.014.patch, 
> HDFS-9890.HDFS-8707.015.patch, HDFS-9890.HDFS-8707.016.patch, 
> HDFS-9890.HDFS-8707.016.patch, hs_err_pid26832.log, hs_err_pid4944.log
>
>
> I propose adding a test suite to simulate various network issues/failures in 
> order to get good test coverage on some of the retry paths that aren't easy 
> to hit in mock unit tests.
> At the moment the only things that hit the retry paths are the gmock unit 
> tests.  The gmock tests are only as good as their mock implementations, which do a 
> great job of simulating protocol correctness but not more complex 
> interactions.  They also can't really simulate the types of lock contention 
> and subtle memory stomps that show up while doing hundreds or thousands of 
> concurrent reads.   We should add a new minidfscluster test that focuses on 
> heavy read/seek load and then randomly convert error codes returned by 
> network functions into errors.
> List of things to simulate (while heavily loaded), roughly in order of how 
> badly I think they need to be tested at the moment:
> -Rpc connection disconnect
> -Rpc connection slowed down enough to cause a timeout and trigger retry
> -DN connection disconnect




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Status: Patch Available  (was: Open)

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 
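
As a rough illustration of the pattern the description asks for, here is a sketch in which every accessor, including size(), takes the same monitor as the mutators. PendingBlocksSketch and its methods are hypothetical and simplified; the real class tracks more state per block.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified holder; not the real PendingReconstructionBlocks.
final class PendingBlocksSketch {
  // Plain HashMap: safe only because every access below holds its monitor.
  private final Map<String, Integer> pending = new HashMap<>();

  // Records one more pending replica for the given block.
  void increment(String block) {
    synchronized (pending) {
      pending.merge(block, 1, Integer::sum);
    }
  }

  // Drops the block entirely.
  void remove(String block) {
    synchronized (pending) {
      pending.remove(block);
    }
  }

  // Reads the size under the same lock as the writers, so a concurrent
  // resize of the map cannot be observed half-done.
  int size() {
    synchronized (pending) {
      return pending.size();
    }
  }
}
```

Without the synchronized block in size(), a reader could run while another thread is mutating the HashMap, which HashMap does not support.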




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Attachment: HDFS-10617.002.patch

Attaching updated trunk patch. Eclipse inserted tabs instead of spaces into the 
first one. 

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch, HDFS-10617.002.patch, 
> HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Attachment: HDSF-10617-b2.001.patch

Attaching branch-2 patch

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch, HDSF-10617-b2.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Attachment: HDFS-10617.001.patch

Attaching patch for trunk

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
> Attachments: HDFS-10617.001.patch
>
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Updated] (HDFS-10617) PendingReconstructionBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Description: PendingReconstructionBlocks (PendingReplicationBlocks in 
branch-2 and below) is a HashMap, which is not a thread-safe data structure. 
Therefore, the size() function should be synchronized just like the rest of the 
member functions.   (was: pendingReconstructions (pendingReplicationBlocks in 
branch-2 and below) is a HashMap, which is not a thread-safe data structure. 
Therefore, the size() function should be synchronized just like the rest of the 
member functions. )
Summary: PendingReconstructionBlocks.size() should be synchronized  
(was: PendingReconstructions.size() should be synchronized)

> PendingReconstructionBlocks.size() should be synchronized
> -
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> PendingReconstructionBlocks (PendingReplicationBlocks in branch-2 and below) 
> is a HashMap, which is not a thread-safe data structure. Therefore, the 
> size() function should be synchronized just like the rest of the member 
> functions. 




[jira] [Updated] (HDFS-10425) Clean up NNStorage and TestSaveNamespace

2016-07-13 Thread Andras Bokor (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor updated HDFS-10425:

Attachment: HDFS-10425.02.patch

Patch 02. The first one no longer applies.

> Clean up NNStorage and TestSaveNamespace
> 
>
> Key: HDFS-10425
> URL: https://issues.apache.org/jira/browse/HDFS-10425
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Andras Bokor
>Assignee: Andras Bokor
>Priority: Trivial
> Attachments: HDFS-10425.01.patch, HDFS-10425.02.patch
>
>
> Since I was working with the NNStorage and TestSaveNamespace classes, it is a 
> good time to take care of the IDE and checkstyle warnings.




[jira] [Updated] (HDFS-10617) PendingReconstructions.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Badger updated HDFS-10617:
---
Description: pendingReconstructions (pendingReplicationBlocks in branch-2 
and below) is a HashMap, which is not a thread-safe data structure. Therefore, 
the size() function should be synchronized just like the rest of the member 
functions.   (was: pendingReplications is a HashMap, which is not a thread-safe 
data structure. Therefore, the size() function should be synchronized just like 
the rest of the member functions. )
Summary: PendingReconstructions.size() should be synchronized  (was: 
PendingReplicationBlocks.size() should be synchronized)

> PendingReconstructions.size() should be synchronized
> 
>
> Key: HDFS-10617
> URL: https://issues.apache.org/jira/browse/HDFS-10617
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.22.0
>Reporter: Eric Badger
>Assignee: Eric Badger
>
> pendingReconstructions (pendingReplicationBlocks in branch-2 and below) is a 
> HashMap, which is not a thread-safe data structure. Therefore, the size() 
> function should be synchronized just like the rest of the member functions. 




[jira] [Created] (HDFS-10617) PendingReplicationBlocks.size() should be synchronized

2016-07-13 Thread Eric Badger (JIRA)

Eric Badger created HDFS-10617:
--

 Summary: PendingReplicationBlocks.size() should be synchronized
 Key: HDFS-10617
 URL: https://issues.apache.org/jira/browse/HDFS-10617
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Eric Badger
Assignee: Eric Badger


pendingReplications is a HashMap, which is not a thread-safe data structure. 
Therefore, the size() function should be synchronized just like the rest of the 
member functions. 
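
The point about size() can be sketched in a few lines of Java. This is a minimal illustration, not the HDFS class: PendingBlocksSketch, its String block ids and its method names are hypothetical, and locking on the map itself is an assumption about how the other member functions synchronize. It shows size() taking the same lock as the mutating methods.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a pending-blocks tracker; not the HDFS class. */
class PendingBlocksSketch {
  // Plain HashMap: every access must hold the same lock.
  private final Map<String, Integer> pending = new HashMap<>();

  /** Records one more pending request for the given block id. */
  void increment(String blockId) {
    synchronized (pending) {
      pending.merge(blockId, 1, Integer::sum);
    }
  }

  /** Removes a block id; returns true if it was being tracked. */
  boolean remove(String blockId) {
    synchronized (pending) {
      return pending.remove(blockId) != null;
    }
  }

  /** Reads the size under the same lock as the writers. */
  int size() {
    synchronized (pending) {
      return pending.size();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    PendingBlocksSketch blocks = new PendingBlocksSketch();
    Thread writer = new Thread(() -> {
      for (int i = 0; i < 1000; i++) {
        blocks.increment("blk_" + i);
      }
    });
    writer.start();
    int whileWriting = blocks.size(); // safe to call concurrently
    writer.join();
    System.out.println(whileWriting + " then " + blocks.size());
  }
}
```

Without the synchronized block in size(), a call racing with increment() could return a stale count, since nothing would guarantee that the writer's updates are visible to the reading thread.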




[jira] [Comment Edited] (HDFS-9809) Abstract implementation-specific details from the datanode

2016-07-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111
 ] 

Ewan Higgs edited comment on HDFS-9809 at 7/13/16 2:33 PM:
---

Hi,
In Storage.java, I think a good deal of copyFileBuffered can be replaced with 
{{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {{fis}} could also be 
renamed {{fin}} to mirror {{fout}}.

Across a lot of these files, loggers use '\+' for string concatenation 
rather than slf4j templating (\{\}). There is an ongoing effort 
(HADOOP-9864, HDFS-8971, etc.) to fix these, so instead of adding more 
LOG statements with '\+', try to take the opportunity to clean them up as you go 
(though this adds to an already rather large patch).
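
The suggestion to swap copyFileBuffered for a library copy routine can be pictured with JDK-only code. This is a minimal sketch, assuming the hand-written method is essentially a buffered read/write loop; StreamCopySketch and its copy method are hypothetical names, and the loop below is a stand-in for the commons-io call, not the Storage.java code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: the kind of loop a library copy call replaces. */
class StreamCopySketch {
  /** Copies everything from fin to fout and returns the byte count. */
  static long copy(InputStream fin, OutputStream fout) throws IOException {
    byte[] buffer = new byte[4096];
    long total = 0;
    int n;
    // read() returns -1 at end of stream
    while ((n = fin.read(buffer)) != -1) {
      fout.write(buffer, 0, n);
      total += n;
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = "block contents".getBytes("UTF-8");
    ByteArrayOutputStream fout = new ByteArrayOutputStream();
    long copied = copy(new ByteArrayInputStream(data), fout);
    System.out.println("copied " + copied + " bytes");
  }
}
```

Naming the streams fin and fout, as proposed for Storage.java, keeps the direction of the copy obvious at the call site.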


was (Author: ehiggs):
Hi,
In Storage.java, I think a good deal of copyFileBuffered can be replaced with 
{{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {fis}] could also be 
renamed {{fin}} to mirror {{fout}}.

Across a lot of these files, loggers use '\+' for string concatenation 
rather than slf4j templating (\{\}). There is an ongoing effort 
(HADOOP-9864, HDFS-8971, etc.) to fix these, so instead of adding more 
LOG statements with '\+', try to take the opportunity to clean them up as you go 
(though this adds to an already rather large patch).

> Abstract implementation-specific details from the datanode
> --
>
> Key: HDFS-9809
> URL: https://issues.apache.org/jira/browse/HDFS-9809
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, 
> HDFS-9809.003.patch, HDFS-9809.004.patch
>
>
> Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) 
> implicitly assume that blocks are stored in java.io.File(s) and that volumes 
> are divided into directories. We propose to abstract these details, which 
> would help in supporting other storages. 




[jira] [Commented] (HDFS-9809) Abstract implementation-specific details from the datanode

2016-07-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111
 ] 

Ewan Higgs commented on HDFS-9809:
--

Hi,
In Storage.java, I think a good deal of copyFileBuffered can be replaced with 
{{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {fis} could also be 
renamed {{fin]} to mirror {{fout}}.

Across a lot of these files, loggers use '\+' for string concatenation 
rather than slf4j templating (\{\}). There is an ongoing effort 
(HADOOP-9864, HDFS-8971, etc.) to fix these, so instead of adding more 
LOG statements with '\+', try to take the opportunity to clean them up as you go 
(though this adds to an already rather large patch).

> Abstract implementation-specific details from the datanode
> --
>
> Key: HDFS-9809
> URL: https://issues.apache.org/jira/browse/HDFS-9809
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, 
> HDFS-9809.003.patch, HDFS-9809.004.patch
>
>
> Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) 
> implicitly assume that blocks are stored in java.io.File(s) and that volumes 
> are divided into directories. We propose to abstract these details, which 
> would help in supporting other storages. 




[jira] [Comment Edited] (HDFS-9809) Abstract implementation-specific details from the datanode

2016-07-13 Thread Ewan Higgs (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-9809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15375111#comment-15375111
 ] 

Ewan Higgs edited comment on HDFS-9809 at 7/13/16 2:32 PM:
---

Hi,
In Storage.java, I think a good deal of copyFileBuffered can be replaced with 
{{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {fis}] could also be 
renamed {{fin}} to mirror {{fout}}.

Across a lot of these files, loggers use '\+' for string concatenation 
rather than slf4j templating (\{\}). There is an ongoing effort 
(HADOOP-9864, HDFS-8971, etc.) to fix these, so instead of adding more 
LOG statements with '\+', try to take the opportunity to clean them up as you go 
(though this adds to an already rather large patch).


was (Author: ehiggs):
Hi,
In Storage.java, I think a good deal of copyFileBuffered can be replaced with 
{{org.apache.commons.io.IOUtils.copy}} (or {{copyLarge}}). {fis} could also be 
renamed {{fin]} to mirror {{fout}}.

Across a lot of these files, loggers use '\+' for string concatenation 
rather than slf4j templating (\{\}). There is an ongoing effort 
(HADOOP-9864, HDFS-8971, etc.) to fix these, so instead of adding more 
LOG statements with '\+', try to take the opportunity to clean them up as you go 
(though this adds to an already rather large patch).

> Abstract implementation-specific details from the datanode
> --
>
> Key: HDFS-9809
> URL: https://issues.apache.org/jira/browse/HDFS-9809
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: datanode, fs
>Reporter: Virajith Jalaparti
>Assignee: Virajith Jalaparti
> Attachments: HDFS-9809.001.patch, HDFS-9809.002.patch, 
> HDFS-9809.003.patch, HDFS-9809.004.patch
>
>
> Multiple parts of the Datanode (FsVolumeSpi, ReplicaInfo, FSVolumeImpl etc.) 
> implicitly assume that blocks are stored in java.io.File(s) and that volumes 
> are divided into directories. We propose to abstract these details, which 
> would help in supporting other storages. 




[jira] [Commented] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374540#comment-15374540
 ] 

Hadoop QA commented on HDFS-10606:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
22s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  9m 
11s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
24s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
14s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
49s{color} | {color:green} branch-2.7 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
17s{color} | {color:green} branch-2.7 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
40s{color} | {color:red} hadoop-common-project/hadoop-common in branch-2.7 has 
3 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} branch-2.7 passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} branch-2.7 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  6m 
15s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  6m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 8 new + 129 unchanged - 0 fixed = 137 total (was 129) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 2545 line(s) that end in whitespace. Use 
git apply --whitespace=fix. {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  1m 
14s{color} | {color:red} The patch 70 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
0s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed with JDK v1.8.0_91 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 20m 34s{color} 
| {color:red} hadoop-common in the patch failed with JDK v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
27s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_91 Failed junit tests | hadoop.http.TestSSLHttpServer |
| JDK v1.8.0_91 Timed out junit tests | 
org.apache.hadoop.conf.TestConfiguration |
| JDK v1.7.0_101 Failed junit tests | hadoop.ha.TestZKFailoverController |
| JDK v1.7.0_101 Timed out junit tests |

[jira] [Commented] (HDFS-3051) A zero-copy ScatterGatherRead api from FSDataInputStream

2016-07-13 Thread Ravikumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374535#comment-15374535
 ] 

Ravikumar commented on HDFS-3051:
-

How about returning the MappedByteBuffers of all blocks of a file when every 
block is local? If there are non-local blocks, this method can simply return an 
empty list.

public List<MappedByteBuffer> readFullyScatterGatherLocal(EnumSet<ReadOption> options)
    throws IOException {
  return ((PositionedReadable) in).readFullyScatterGather(options);
}

A quick sample implementation could look like this:

public List<MappedByteBuffer> readFullyScatterGatherLocal(EnumSet<ReadOption> readOptions)
    throws IOException {
  List<LocatedBlock> blockRange = getBlockRange(0, getFileLength());
  if (!allBlocksInLocal(blockRange)) {
    // At least one block is not local: return an empty list.
    return Collections.emptyList();
  }
  List<MappedByteBuffer> retval = new LinkedList<>();
  for (LocatedBlock blk : blockRange) {
    BlockReader blkReader = fetchBlockReader(blk, localDNAddrPair);
    ClientMmap mmap = blkReader.getClientMmap(readOptions);
    // Instruct cache eviction not to unmap this buffer.
    // Slots, streams and all other resources will still be closed.
    mmap.setunmap(false);
    retval.add(mmap.getMappedByteBuffer());
    closeBlockReader(blkReader);
  }
  return retval;
}

Apps that open InputStreams only once (HBase?) can call this method and use the 
zero-copy buffers for reads if the file is local. If the buffers are not 
available, they can fall back to the regular DFSInputStream. Reads can then 
avoid sync overheads and get the same performance as a local filesystem.

But I don't know whether "leaking" MappedByteBuffers to calling code can have 
nasty side effects. 
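
The idea of handing out one mapped buffer per local block can be tried outside HDFS with plain java.nio. This is a minimal sketch, assuming a local file split into fixed-size pieces stands in for a file's local blocks; LocalMmapSketch, mapInChunks and the chunk size are hypothetical, and none of this is the DFSInputStream API.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: maps a local file as a list of read-only buffers. */
class LocalMmapSketch {
  /** Returns one mapped buffer per chunk; an empty file yields an empty list. */
  static List<MappedByteBuffer> mapInChunks(Path file, long chunkSize) throws IOException {
    if (chunkSize <= 0 || chunkSize > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("chunkSize must be in (0, Integer.MAX_VALUE]");
    }
    List<MappedByteBuffer> buffers = new ArrayList<>();
    // The channel is closed on exit; the mappings created from it stay valid.
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
      long size = channel.size();
      for (long pos = 0; pos < size; pos += chunkSize) {
        long len = Math.min(chunkSize, size - pos);
        buffers.add(channel.map(FileChannel.MapMode.READ_ONLY, pos, len));
      }
    }
    return buffers;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("mmap-sketch", ".bin");
    Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
    for (MappedByteBuffer buf : mapInChunks(tmp, 4)) {
      System.out.println("mapped " + buf.remaining() + " bytes");
    }
  }
}
```

The buffers remain usable after the channel is closed, and a mapping is only released once its buffer is garbage collected, which is the kind of side effect the question about leaking MappedByteBuffers points at.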






> A zero-copy ScatterGatherRead api from FSDataInputStream
> 
>
> Key: HDFS-3051
> URL: https://issues.apache.org/jira/browse/HDFS-3051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
>
> It will be nice if we can get a new API from FSDataInputStream that allows for 
> zero-copy read for hdfs readers.




[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374536#comment-15374536
 ] 

Hadoop QA commented on HDFS-10544:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 135 unchanged - 0 fixed = 137 total (was 135) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 59m 
52s{color} | {color:green} hadoop-hdfs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 80m 11s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817602/HDFS-10544.05.patch |
| JIRA Issue | HDFS-10544 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 2d63fcb45c0a 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 06c56ff |
| Default Java | 1.8.0_91 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16039/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16039/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16039/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
>

[jira] [Commented] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures

2016-07-13 Thread Rakesh R (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374519#comment-15374519
 ] 

Rakesh R commented on HDFS-10590:
-

Thank you [~umamaheswararao] for reviewing and committing the patch.

> Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
> 
>
> Key: HDFS-10590
> URL: https://issues.apache.org/jira/browse/HDFS-10590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10590-00.patch
>
>
> This jira is to fix the test case failure. Please see the below stacktrace.
> Reference : 
> [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324)
> {code}




[jira] [Updated] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures

2016-07-13 Thread Uma Maheswara Rao G (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Uma Maheswara Rao G updated HDFS-10590:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha1
   Status: Resolved  (was: Patch Available)

I have committed this to trunk. Thanks Rakesh.

> Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
> 
>
> Key: HDFS-10590
> URL: https://issues.apache.org/jira/browse/HDFS-10590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10590-00.patch
>
>
> This jira is to fix the test case failure. Please see the below stacktrace.
> Reference : 
> [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324)
> {code}




[jira] [Commented] (HDFS-10590) Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures

2016-07-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374516#comment-15374516
 ] 

Hudson commented on HDFS-10590:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10087 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10087/])
HDFS-10590: Fix TestReconstructStripedBlocks.testCountLiveReplicas test 
(uma.gangumalla: rev 438b7c5935f4314fd37916aee4369e67ec2887f8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestReconstructStripedBlocks.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/StripedFileTestUtil.java


> Fix TestReconstructStripedBlocks.testCountLiveReplicas test failures
> 
>
> Key: HDFS-10590
> URL: https://issues.apache.org/jira/browse/HDFS-10590
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10590-00.patch
>
>
> This jira is to fix the test case failure. Please see the below stacktrace.
> Reference : 
> [Build_15968|https://builds.apache.org/job/PreCommit-HDFS-Build/15968/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestReconstructStripedBlocks/testCountLiveReplicas/]
> {code}
> java.lang.AssertionError: null
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestReconstructStripedBlocks.testCountLiveReplicas(TestReconstructStripedBlocks.java:324)
> {code}




[jira] [Commented] (HDFS-10587) Incorrect offset/length calculation in pipeline recovery causes block corruption

2016-07-13 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374480#comment-15374480
 ] 

Yongjun Zhang commented on HDFS-10587:
--

About the visibleLength, I saw 
{code}
In ReplicaBeingWritten.java
  @Override
  public long getVisibleLength() {
    return getBytesAcked(); // all acked bytes are visible
  }
{code}
which means different replicas may have different visibleLength values, because 
BytesAcked may differ between DataNodes.

Earlier I tried to argue that using a different visibleLength on the 
BlockReceiver side than on the BlockSender side is wrong. Based on the above 
code, it might be OK to treat the received data length as the visibleLength at 
the destination side of blockTransfer (better to get confirmation, though).

So we need to understand how the corruption really happened, and where in the 
block data: did it happen when we received this chunk of data, or when we 
received new data after reconstructing the pipeline? Based on my analysis so 
far, the skipping of the bytes on disk (mentioned in the following statement) 
is necessary, since the data is not garbage (assuming the data at the Sender 
side is good).
{quote}
(8) When new data was appended to the destination, it skipped the bytes already 
on disk. Therefore, whatever was written as garbage was not replaced.
{quote}

One possibility is that the checksum handling there is not correct in a corner 
case. 

If we have a test case that reproduces the issue, we need to look at both the 
source side data and the destination side data, to see whether it's real data 
corruption or a checksum miscalculation. If there is corruption, we need to 
find exactly where it is.
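
The comparison of source side and destination side data can be sketched as a chunk-by-chunk check. This is a minimal illustration, not DataNode code: ChunkCompareSketch and firstMismatchedChunk are hypothetical names, the 512-byte chunk size is taken from the issue description, and java.util.zip.CRC32 is an assumption standing in for whatever checksum type the cluster is configured with.

```java
import java.util.zip.CRC32;

/** Hypothetical helper: finds the first chunk where two replicas disagree. */
class ChunkCompareSketch {
  static final int CHUNK_SIZE = 512;

  /** CRC32 of data[offset, offset + length). */
  static long chunkCrc(byte[] data, int offset, int length) {
    CRC32 crc = new CRC32();
    crc.update(data, offset, length);
    return crc.getValue();
  }

  /**
   * Returns the index of the first chunk whose checksum differs between the
   * two replicas, or -1 if none does. If one replica is longer, the chunk
   * holding its first extra byte counts as a mismatch.
   */
  static int firstMismatchedChunk(byte[] source, byte[] dest) {
    int common = Math.min(source.length, dest.length);
    for (int off = 0; off < common; off += CHUNK_SIZE) {
      int len = Math.min(CHUNK_SIZE, common - off);
      if (chunkCrc(source, off, len) != chunkCrc(dest, off, len)) {
        return off / CHUNK_SIZE;
      }
    }
    return source.length == dest.length ? -1 : common / CHUNK_SIZE;
  }

  public static void main(String[] args) {
    byte[] source = new byte[1300];          // ends in the middle of chunk 2
    byte[] dest = new byte[3 * CHUNK_SIZE];  // padded out to a full chunk
    System.out.println("first mismatched chunk: " + firstMismatchedChunk(source, dest));
  }
}
```

If the two replicas agree byte for byte but the stored checksum still fails, that would point at checksum miscalculation rather than data corruption.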




> Incorrect offset/length calculation in pipeline recovery causes block 
> corruption
> 
>
> Key: HDFS-10587
> URL: https://issues.apache.org/jira/browse/HDFS-10587
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
> Attachments: HDFS-10587.001.patch
>
>
> We found that incorrect offset and length calculation in pipeline recovery may 
> cause block corruption and result in missing blocks under a very unfortunate 
> scenario. 
> (1) A client established a pipeline and started writing data to it.
> (2) One of the data nodes in the pipeline restarted, closing the socket, and 
> some written data was unacknowledged.
> (3) The client replaced the failed data node with a new one, initiating a block 
> transfer to copy the existing data in the block to the new datanode.
> (4) The block was transferred to the new node. Crucially, the entire block, 
> including the unacknowledged data, was transferred.
> (5) The last chunk (512 bytes) was not a full chunk, but the destination 
> still reserved the whole chunk in its buffer and wrote the entire buffer to 
> disk, so some of the written data is garbage.
> (6) When the transfer was done, the destination data node converted the 
> replica from temporary to rbw, which set its visible length to the number of 
> bytes on disk. That is to say, it treated everything that was transferred as 
> acknowledged. However, the visible length of the replica differs from that of 
> the source of the transfer (it is rounded up to the next multiple of 512). [1]
> (7) The client then truncated the block in an attempt to remove the 
> unacknowledged data. However, because the visible length equaled the bytes on 
> disk, it did not truncate the unacknowledged data.
> (8) When new data was appended to the destination, it skipped the bytes 
> already on disk. Therefore, whatever was written as garbage was not replaced.
> (9) The volume scanner detected the corrupt replica, but due to HDFS-10512, it 
> wouldn’t tell the NameNode to mark the replica as corrupt, so the client 
> continued to form a pipeline using the corrupt replica.
> (10) Finally, the DN that had the only healthy replica was restarted. The 
> NameNode then updated the pipeline to contain only the corrupt replica.
> (11) The client continued to write to the corrupt replica, because neither the 
> client nor the data node itself knew the replica was corrupt. When the 
> restarted datanodes came back, their replicas were stale, although not corrupt. 
> Therefore, none of the replicas is good and up to date.
> The sequence of events was reconstructed based on DataNode/NameNode logs and 
> my understanding of the code.
> Incidentally, we have observed the same sequence of events on two independent 
> clusters.
> [1]
> The sender has the replica as follows:
> 2016-04-15 22:03:05,066 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: 
> Recovering ReplicaBeingWritten, blk_1556997324_1100153495099, RBW
>   getNumBytes() = 41381376
>   getBytesOnDisk()  = 41381376
>
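
The length mismatch described in steps (5) and (6) of the report can be illustrated with a little arithmetic. The sketch below is not DataNode code; the class name, the method names, and the 1000-byte source length are hypothetical and chosen only for illustration. It assumes the 512-byte chunk size stated in the report and shows how writing a partial last chunk as a full chunk leaves extra bytes on disk.

```java
public class ChunkLengthSketch {
    // Checksum chunk size taken from the report (512 bytes).
    static final long BYTES_PER_CHUNK = 512;

    /** Rounds a byte count up to the next multiple of the chunk size. */
    static long roundUpToChunk(long numBytes) {
        return (numBytes + BYTES_PER_CHUNK - 1) / BYTES_PER_CHUNK * BYTES_PER_CHUNK;
    }

    /** Extra bytes that land on disk when a partial last chunk is written as a full chunk. */
    static long paddingBytes(long numBytes) {
        return roundUpToChunk(numBytes) - numBytes;
    }

    public static void main(String[] args) {
        long sourceLength = 1000; // hypothetical replica length at the transfer source
        long destinationLength = roundUpToChunk(sourceLength);
        System.out.println("source length:      " + sourceLength);
        System.out.println("destination length: " + destinationLength);
        System.out.println("garbage bytes:      " + paddingBytes(sourceLength));
    }
}
```

If the destination treats the rounded-up length as its visible length, the padding bytes count as acknowledged data, which is consistent with the report's observation that the later truncate and append did not remove them.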

[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374470#comment-15374470
 ] 

Zhe Zhang commented on HDFS-10544:
--

Committed v5 patch to trunk, branch-2, and branch-2.8. I'm working on resolving 
branch-2.7 conflicts.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, 
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, 
> HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also check whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.
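
The decision proposed in the description can be sketched in isolation. The code below is not the committed patch and does not use the Hadoop API: the NameService class, its fields, and the host names are hypothetical stand-ins for values that would be read from the configuration and from the failover proxy provider. It only shows the branching: use the logical URI when HA is enabled and the provider uses logical URIs, otherwise fall back to a physical address.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

public class NameServiceUriSketch {
    /** Hypothetical stand-in for the per-nameservice settings. */
    static final class NameService {
        final String id;             // nameservice ID, e.g. "ns1"
        final boolean haEnabled;     // whether HA is configured for this nameservice
        final boolean useLogicalUri; // what the failover proxy provider would report
        final String rpcAddress;     // physical host:port (a virtual IP for IP failover)

        NameService(String id, boolean haEnabled, boolean useLogicalUri, String rpcAddress) {
            this.id = id;
            this.haEnabled = haEnabled;
            this.useLogicalUri = useLogicalUri;
            this.rpcAddress = rpcAddress;
        }
    }

    /** Returns one URI per nameservice: logical when appropriate, physical otherwise. */
    static List<URI> nameServiceUris(List<NameService> services) {
        List<URI> ret = new ArrayList<>();
        for (NameService ns : services) {
            if (ns.haEnabled && ns.useLogicalUri) {
                // The logical URI names the nameservice, not a host.
                ret.add(URI.create("hdfs://" + ns.id));
            } else {
                // Non-HA, or a provider such as IP failover that needs a physical address.
                ret.add(URI.create("hdfs://" + ns.rpcAddress));
            }
        }
        return ret;
    }

    public static void main(String[] args) {
        List<NameService> services = new ArrayList<>();
        services.add(new NameService("ns1", true, true, "nn1.example.com:8020"));
        services.add(new NameService("ns2", true, false, "nn-vip.example.com:8020"));
        for (URI uri : nameServiceUris(services)) {
            System.out.println(uri);
        }
    }
}
```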




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Fix Version/s: 3.0.0-alpha1
   2.9.0
   2.8.0

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha1
>
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, 
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, 
> HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also check whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Commented] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15374461#comment-15374461
 ] 

Hudson commented on HDFS-10544:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10086 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10086/])
HDFS-10544. Balancer doesn't work with IPFailoverProxyProvider. (zhz: rev 
087290e6b1cb1082646d966b65494082712ebe3e)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSUtil.java


> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, 
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, 
> HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also check whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.




[jira] [Updated] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured

2016-07-13 Thread He Xiaoqiao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10606:
---
Affects Version/s: (was: 2.7.1)
   2.7.0
   Status: Patch Available  (was: Open)

> TrashPolicyDefault supports time of auto clean up can configured
> 
>
> Key: HDFS-10606
> URL: https://issues.apache.org/jira/browse/HDFS-10606
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.0
>Reporter: He Xiaoqiao
> Attachments: HDFS-10606-branch-2.7.001.patch
>
>
> TrashPolicyDefault currently cleans up Trash based on 
> [UTC|http://www.worldtimeserver.com/current_time_in_UTC.aspx], and 
> the cleanup time is 00:00 UTC. When a large amount of trash 
> data has to be auto-cleaned, the cleanup blocks the NN for a long time because 
> of the global lock. In the most serious situations this may cause some cron job 
> submissions to fail. Adding a configuration option for the cleanup time would 
> avoid the impact on these cron jobs at the default time.
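
One way to picture the proposal is a scheduler that sleeps until a configured time of day in UTC instead of always waking at 00:00 UTC. The sketch below is not the attached patch and is not TrashPolicyDefault code; the class name, the method name, and the 02:30 example time are hypothetical. It computes how long to wait from a given instant until the next occurrence of a configured cleanup time.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class TrashCleanupTimeSketch {
    /**
     * Returns the delay from nowUtc until the next occurrence of cleanupTime.
     * If the cleanup time has already passed today (or is exactly now),
     * the next occurrence is tomorrow.
     */
    static Duration delayUntilNextCleanup(ZonedDateTime nowUtc, LocalTime cleanupTime) {
        ZonedDateTime next = nowUtc.with(cleanupTime);
        if (!next.isAfter(nowUtc)) {
            next = next.plusDays(1);
        }
        return Duration.between(nowUtc, next);
    }

    public static void main(String[] args) {
        // Hypothetical configured value; the current behavior corresponds to 00:00.
        LocalTime cleanupTime = LocalTime.of(2, 30);
        ZonedDateTime now = ZonedDateTime.now(ZoneOffset.UTC);
        System.out.println("next cleanup in " + delayUntilNextCleanup(now, cleanupTime));
    }
}
```

Choosing a time outside the window in which scheduled jobs are submitted is the point of the proposal; the calculation itself stays in UTC, so it does not depend on the server's local time zone.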




[jira] [Updated] (HDFS-10606) TrashPolicyDefault supports time of auto clean up can configured

2016-07-13 Thread He Xiaoqiao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Xiaoqiao updated HDFS-10606:
---
Attachment: HDFS-10606-branch-2.7.001.patch

Submitting a patch for branch-2.7.

> TrashPolicyDefault supports time of auto clean up can configured
> 
>
> Key: HDFS-10606
> URL: https://issues.apache.org/jira/browse/HDFS-10606
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 2.7.1
>Reporter: He Xiaoqiao
> Attachments: HDFS-10606-branch-2.7.001.patch
>
>
> TrashPolicyDefault currently cleans up Trash based on 
> [UTC|http://www.worldtimeserver.com/current_time_in_UTC.aspx], and 
> the cleanup time is 00:00 UTC. When a large amount of trash 
> data has to be auto-cleaned, the cleanup blocks the NN for a long time because 
> of the global lock. In the most serious situations this may cause some cron job 
> submissions to fail. Adding a configuration option for the cleanup time would 
> avoid the impact on these cron jobs at the default time.




[jira] [Updated] (HDFS-10544) Balancer doesn't work with IPFailoverProxyProvider

2016-07-13 Thread Zhe Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-10544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-10544:
-
Attachment: HDFS-10544.05.patch

Thanks [~shv] for the review! Attaching the v5 patch, which fixes 2 checkstyle 
issues in v4 (lines in {{TestDFSUtil}} too long). The other 2 checkstyle issues 
are inherent in the original code style.

The reported test failures are unrelated and cannot be reproduced locally.

I'll commit v5 patch shortly.

> Balancer doesn't work with IPFailoverProxyProvider
> --
>
> Key: HDFS-10544
> URL: https://issues.apache.org/jira/browse/HDFS-10544
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Attachments: HDFS-10544.00.patch, HDFS-10544.01.patch, 
> HDFS-10544.02.patch, HDFS-10544.03.patch, HDFS-10544.04.patch, 
> HDFS-10544.05.patch
>
>
> Right now {{Balancer}} gets the NN URIs through 
> {{DFSUtil#getNameServiceUris}}, which returns logical URIs if HA is enabled. 
> If {{IPFailoverProxyProvider}} is used, {{Balancer}} will not be able to 
> start.
> I think the bug is in {{DFSUtil#getNameServiceUris}}:
> {code}
> for (String nsId : getNameServiceIds(conf)) {
>   if (HAUtil.isHAEnabled(conf, nsId)) {
>     // Add the logical URI of the nameservice.
>     try {
>       ret.add(new URI(HdfsConstants.HDFS_URI_SCHEME + "://" + nsId));
> {code}
> The {{if}} clause should also check whether the {{FailoverProxyProvider}} has 
> {{useLogicalURI}} enabled. If not, {{getNameServiceUris}} should try to 
> resolve the physical URI for this nsId.



