[jira] [Commented] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui

2016-08-03 Thread Yuanbo Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407034#comment-15407034
 ] 

Yuanbo Liu commented on HDFS-10645:
---

I don't think these failures are related to my change. Resuming progress to 
get a new result.

> Make block report size as a metric and add this metric to datanode web ui
> -
>
> Key: HDFS-10645
> URL: https://issues.apache.org/jira/browse/HDFS-10645
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, 
> HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, 
> HDFS-10645.006.patch, HDFS-10645.007.patch, Selection_047.png, 
> Selection_048.png
>
>
> Record the block report size as a metric and show it on the datanode UI. It's 
> important for administrators to know the bottleneck of the block report, and 
> the metric is also useful for tuning.
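
For illustration only: a minimal sketch of how such a gauge could be exposed through Hadoop's metrics2 API. The class and metric names here are hypothetical, not necessarily those used in the attached patches.

{code:java}
// Illustrative sketch only: names are hypothetical, not from the patch.
import org.apache.hadoop.metrics2.annotation.Metric;
import org.apache.hadoop.metrics2.annotation.Metrics;
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
import org.apache.hadoop.metrics2.lib.MutableGaugeLong;

@Metrics(about = "Block report metrics", context = "dfs")
public class BlockReportMetrics {
  @Metric("Size of the last full block report, in bytes")
  MutableGaugeLong lastBlockReportSize;

  public static BlockReportMetrics create() {
    // Registering the annotated source injects the @Metric fields and
    // makes the gauge visible over JMX, where a web UI can read it.
    return DefaultMetricsSystem.instance()
        .register("BlockReportMetrics", null, new BlockReportMetrics());
  }

  public void setLastBlockReportSize(long bytes) {
    lastBlockReportSize.set(bytes);
  }
}
{code}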






[jira] [Updated] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui

2016-08-03 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu updated HDFS-10645:
--
Status: Patch Available  (was: In Progress)

> Make block report size as a metric and add this metric to datanode web ui
> -
>
> Key: HDFS-10645
> URL: https://issues.apache.org/jira/browse/HDFS-10645
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, 
> HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, 
> HDFS-10645.006.patch, HDFS-10645.007.patch, Selection_047.png, 
> Selection_048.png
>
>
> Record the block report size as a metric and show it on the datanode UI. It's 
> important for administrators to know the bottleneck of the block report, and 
> the metric is also useful for tuning.






[jira] [Updated] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui

2016-08-03 Thread Yuanbo Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuanbo Liu updated HDFS-10645:
--
Status: In Progress  (was: Patch Available)

> Make block report size as a metric and add this metric to datanode web ui
> -
>
> Key: HDFS-10645
> URL: https://issues.apache.org/jira/browse/HDFS-10645
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, ui
>Reporter: Yuanbo Liu
>Assignee: Yuanbo Liu
> Attachments: HDFS-10645.001.patch, HDFS-10645.002.patch, 
> HDFS-10645.003.patch, HDFS-10645.004.patch, HDFS-10645.005.patch, 
> HDFS-10645.006.patch, HDFS-10645.007.patch, Selection_047.png, 
> Selection_048.png
>
>
> Record the block report size as a metric and show it on the datanode UI. It's 
> important for administrators to know the bottleneck of the block report, and 
> the metric is also useful for tuning.






[jira] [Commented] (HDFS-10169) TestEditLog.testBatchedSyncWithClosedLogs with useAsyncEditLog sometimes fails

2016-08-03 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407030#comment-15407030
 ] 

Rakesh R commented on HDFS-10169:
-

This test is failing frequently in Jenkins builds. Could someone help review 
this jira? Thanks!

https://builds.apache.org/job/PreCommit-HDFS-Build/16309/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestEditLog/testBatchedSyncWithClosedLogs_1_/

https://builds.apache.org/job/PreCommit-HDFS-Build/16304/testReport/junit/org.apache.hadoop.hdfs.server.namenode/TestEditLog/testBatchedSyncWithClosedLogs_1_/

> TestEditLog.testBatchedSyncWithClosedLogs with useAsyncEditLog sometimes fails
> --
>
> Key: HDFS-10169
> URL: https://issues.apache.org/jira/browse/HDFS-10169
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Kihwal Lee
>Assignee: Rakesh R
> Attachments: HDFS-10169-00.patch, HDFS-10169-01.patch
>
>
> This failure has been seen in multiple precommit builds recently.
> {noformat}
> testBatchedSyncWithClosedLogs[1](org.apache.hadoop.hdfs.server.namenode.TestEditLog)
>   Time elapsed: 0.377 sec  <<< FAILURE!
> java.lang.AssertionError: logging edit without syncing should do not affect 
> txid expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.hdfs.server.namenode.TestEditLog.testBatchedSyncWithClosedLogs(TestEditLog.java:594)
> {noformat}






[jira] [Updated] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-10718:
-
Release Note: Re-submitted the patch to trigger a Jenkins build.
  Status: Patch Available  (was: Open)

> Prefer direct ByteBuffer in native RS encoder and decoder
> -
>
> Key: HDFS-10718
> URL: https://issues.apache.org/jira/browse/HDFS-10718
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10718-v1.patch
>
>
> Enable the ByteBuffer preference in the native Reed-Solomon encoder and 
> decoder, to improve overall performance. 






[jira] [Commented] (HDFS-10690) Optimize insertion/removal of replica in ShortCircuitCache.java

2016-08-03 Thread Fenghua Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407012#comment-15407012
 ] 

Fenghua Hu commented on HDFS-10690:
---

Xiaoyu,

[~xiaoyuyao] I tried to replace TreeMap with LinkedHashMap, but found that 
LinkedHashMap lacks a "ceilingEntry" function or a similar alternative, which is 
key to implementing an LRU-based replacement algorithm. LinkedHashMap also can't 
provide getYoungest, getEldest, or similar functions. That is to say, if we 
want to use LinkedHashMap, we actually need to rewrite it. Any comments? Thanks.
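
For readers following along, a small self-contained comparison of the two APIs (toy keys and values, not the actual ShortCircuitCache types):

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapLookupDemo {
  public static void main(String[] args) {
    // TreeMap keeps keys sorted and supports ordered lookups.
    TreeMap<Long, String> evictable = new TreeMap<>();
    evictable.put(100L, "replica-a");
    evictable.put(200L, "replica-b");
    Map.Entry<Long, String> e = evictable.ceilingEntry(150L);
    System.out.println(e.getKey());   // 200: the smallest key >= 150

    // LinkedHashMap only exposes iteration order (insertion or access
    // order); there is no ceilingEntry, and the "eldest" entry is only
    // reachable by iterating from the front.
    LinkedHashMap<Long, String> lru = new LinkedHashMap<>(16, 0.75f, true);
    lru.put(100L, "replica-a");
    lru.put(200L, "replica-b");
    lru.get(100L);                    // access moves key 100 to the back
    Long eldest = lru.keySet().iterator().next();
    System.out.println(eldest);       // 200 is now the eldest
  }
}
{code}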




> Optimize insertion/removal of replica in ShortCircuitCache.java
> ---
>
> Key: HDFS-10690
> URL: https://issues.apache.org/jira/browse/HDFS-10690
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 3.0.0-alpha2
>Reporter: Fenghua Hu
>Assignee: Fenghua Hu
> Attachments: HDFS-10690.001.patch, HDFS-10690.002.patch
>
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> Currently in ShortCircuitCache, two TreeMap objects are used to track the 
> cached replicas.
> private final TreeMap<Long, ShortCircuitReplica> evictable = new TreeMap<>();
> private final TreeMap<Long, ShortCircuitReplica> evictableMmapped = new 
> TreeMap<>();
> TreeMap employs a red-black tree for sorting. This isn't an issue when using 
> traditional HDDs, but when using high-performance SSD/PCIe flash, the cost of 
> inserting/removing an entry becomes considerable.
> To mitigate it, we designed a new list-based structure for replica tracking.
> The list is a doubly-linked FIFO. FIFO order is time-based, so insertion is a 
> very low-cost operation. On the other hand, a list is not lookup-friendly. To 
> address this issue, we introduce two references into the ShortCircuitReplica 
> object:
> ShortCircuitReplica next = null;
> ShortCircuitReplica prev = null;
> In this way, no lookup is needed when removing a replica from the list; we 
> only need to modify its predecessor's and successor's references.
> Our tests showed a 15-50% performance improvement when using PCIe flash 
> as the storage medium.
> The original patch is against 2.6.4; I am now porting it to Hadoop trunk, and 
> the patch will be posted soon.
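
A minimal sketch of the intrusive doubly-linked list described above (simplified names; the actual patch works on ShortCircuitReplica):

{code:java}
// Simplified sketch: each element embeds its own prev/next references,
// so removal is O(1) with no lookup, as the description proposes.
class Replica {
  Replica prev;
  Replica next;
}

class ReplicaFifo {
  private Replica head;   // eldest, evicted first
  private Replica tail;   // youngest, most recently inserted

  void append(Replica r) {
    // O(1) insertion at the tail; FIFO order is insertion-time order.
    r.prev = tail;
    r.next = null;
    if (tail == null) {
      head = r;
    } else {
      tail.next = r;
    }
    tail = r;
  }

  void remove(Replica r) {
    // O(1) removal: only the neighbors' references are touched.
    if (r.prev == null) {
      head = r.next;
    } else {
      r.prev.next = r.next;
    }
    if (r.next == null) {
      tail = r.prev;
    } else {
      r.next.prev = r.prev;
    }
    r.prev = null;
    r.next = null;
  }
}
{code}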






[jira] [Updated] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread Kai Zheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Zheng updated HDFS-10718:
-
Status: Open  (was: Patch Available)

> Prefer direct ByteBuffer in native RS encoder and decoder
> -
>
> Key: HDFS-10718
> URL: https://issues.apache.org/jira/browse/HDFS-10718
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10718-v1.patch
>
>
> Enable the ByteBuffer preference in the native Reed-Solomon encoder and 
> decoder, to improve overall performance. 






[jira] [Commented] (HDFS-10434) Fix intermittent test failure of TestDataNodeErasureCodingMetrics

2016-08-03 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407001#comment-15407001
 ] 

Rakesh R commented on HDFS-10434:
-

Thanks [~yzhangal] and [~liuml07] for pointing this out. I'll dig more into this 
test failure and get back to you.

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10434-00.patch, HDFS-10434-01.patch
>
>
> This jira is to fix the test case failure.
> Reference : 
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks 
> expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}






[jira] [Commented] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs

2016-08-03 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406986#comment-15406986
 ] 

Weiwei Yang commented on HDFS-10569:


My pleasure, thanks [~kihwal].

> A bug causes OutOfIndex error in BlockListAsLongs
> -
>
> Key: HDFS-10569
> URL: https://issues.apache.org/jira/browse/HDFS-10569
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Minor
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, 
> HDFS-10569.003.patch, HDFS-10569.004.patch
>
>
> An obvious bug in LongsDecoder.getBlockListAsLongs(): the size of the var *longs* 
> is the size of *values* plus 2, but the for-loop accesses *values* using the 
> *longs* index. This will cause an out-of-index error.
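
A self-contained illustration of the reported pattern (hypothetical names, not the actual BlockListAsLongs code):

{code:java}
import java.util.Arrays;
import java.util.List;

public class OutOfIndexDemo {
  public static void main(String[] args) {
    List<Long> values = Arrays.asList(7L, 8L, 9L);
    // longs has two extra header slots, so it is values.size() + 2 long.
    long[] longs = new long[values.size() + 2];
    longs[0] = values.size();   // header words, illustrative only
    longs[1] = 0L;
    for (int i = 2; i < longs.length; i++) {
      // Buggy pattern: indexing values with the longs index overruns
      // the list and throws IndexOutOfBoundsException at i == 3.
      // The fix is to offset the index: values.get(i - 2).
      longs[i] = values.get(i);
    }
  }
}
{code}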






[jira] [Comment Edited] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Guangbin Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406965#comment-15406965
 ] 

Guangbin Zhu edited comment on HDFS-10715 at 8/4/16 1:43 AM:
-

Many thanks, [~kihwal]. I found one failed test, 
{code:java}hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics{code}, 
but it is unrelated to this issue.

Should I commit this patch?


was (Author: zhuguangbin86):
Many thanks, Kihwal. I found one failed test, 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics, but it is 
unrelated to this issue.

Should I commit this patch?

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
>Assignee: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch, HDFS-10715.004.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs: 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the method chooseDataNode: 
> clusterMap.chooseRandom may return null, which cannot be compared using the 
> a.equals(b) method. 
> Though this exception can be caught and the call retried, I think 
> this bug should be fixed.
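
A tiny standalone demo of the null-comparison hazard described in the last paragraph (toy types; the real code compares DatanodeDescriptor objects returned by clusterMap.chooseRandom):

{code:java}
import java.util.Objects;
import java.util.Random;

public class NullCompareDemo {
  // Stand-in for clusterMap.chooseRandom(), which may return null
  // when no suitable node exists in the requested scope.
  static Integer chooseRandom(Random r) {
    return r.nextBoolean() ? r.nextInt(100) : null;
  }

  public static void main(String[] args) {
    Random r = new Random();
    Integer a = chooseRandom(r);
    Integer b = chooseRandom(r);
    // Buggy pattern: a.equals(b) throws NullPointerException when a is null.
    // Null-safe alternative: guard explicitly or use Objects.equals.
    if (Objects.equals(a, b)) {
      System.out.println("same choice: " + a);
    } else {
      System.out.println("chose " + a + " and " + b);
    }
  }
}
{code}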




[jira] [Commented] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Guangbin Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406965#comment-15406965
 ] 

Guangbin Zhu commented on HDFS-10715:
-

Many thanks, Kihwal. I found one failed test, 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics, but it is 
unrelated to this issue.

Should I commit this patch?

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
>Assignee: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch, HDFS-10715.004.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs: 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the method chooseDataNode: 
> clusterMap.chooseRandom may return null, which cannot be compared using the 
> a.equals(b) method. 
> Though this exception can be caught and the call retried, I think 
> this bug should be fixed.






[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406669#comment-15406669
 ] 

Chen Liang commented on HDFS-10682:
---

Thanks to [~arpitagarwal] for the advice! I moved the separate lock class to 
Hadoop common utils and filed HADOOP-13466 to keep track of it. I will later 
update this JIRA to just use the lock class.

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.
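
A hypothetical sketch of why a dedicated lock object enables such measurements; this is just the general shape, not the class filed under HADOOP-13466:

{code:java}
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: a lock wrapper that records how long the lock
// was held and warns when a holder (e.g. a slow disk operation) is slow.
// Reentrant re-acquisition is not handled in this sketch.
public class InstrumentedLock implements AutoCloseable {
  private final ReentrantLock lock = new ReentrantLock();
  private final long warnThresholdMs;
  private long acquiredAtMs;

  public InstrumentedLock(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  public InstrumentedLock acquire() {
    lock.lock();
    acquiredAtMs = System.currentTimeMillis();
    return this;
  }

  @Override
  public void close() {
    // Measure before unlocking so another thread can't overwrite the timestamp.
    long heldMs = System.currentTimeMillis() - acquiredAtMs;
    lock.unlock();
    if (heldMs > warnThresholdMs) {
      System.err.println("Lock held for " + heldMs + " ms");
    }
  }
}
// Usage: try (InstrumentedLock l = datasetLock.acquire()) { ... }
{code}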






[jira] [Commented] (HDFS-10707) Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406657#comment-15406657
 ] 

Hadoop QA commented on HDFS-10707:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
36s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
33s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
11s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
45s{color} | {color:green} branch-2 passed {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  1m 
41s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs-client in branch-2 has 
7 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
26s{color} | {color:green} branch-2 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
13s{color} | {color:green} branch-2 passed with JDK v1.7.0_101 {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
8s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} hadoop-hdfs-project: The patch generated 0 new + 115 
unchanged - 2 fixed = 115 total (was 117) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green} the patch passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
58s{color} | {color:green} hadoop-hdfs-client in the patch passed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 46m 
25s{color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_101. 
{color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m 
26s{color} | {color:green} hadoop-hdfs-httpfs in the patch passed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
20s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}144m 57s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || 

[jira] [Commented] (HDFS-10434) Fix intermittent test failure of TestDataNodeErasureCodingMetrics

2016-08-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406648#comment-15406648
 ] 

Kai Zheng commented on HDFS-10434:
--

Traveling this week; please expect a delayed response. Thanks.


> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10434-00.patch, HDFS-10434-01.patch
>
>
> This jira is to fix the test case failure.
> Reference : 
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks 
> expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10434) Fix intermittent test failure of TestDataNodeErasureCodingMetrics

2016-08-03 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406645#comment-15406645
 ] 

Mingliang Liu commented on HDFS-10434:
--

And the test failure happens again at 
https://issues.apache.org/jira/browse/HDFS-10707?focusedCommentId=15400236&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15400236

Basically, 10s should be enough time to wait for the metrics to be updated. As 
[~yzhangal] mentioned, is there a different issue? Thanks.

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -
>
> Key: HDFS-10434
> URL: https://issues.apache.org/jira/browse/HDFS-10434
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Rakesh R
>Assignee: Rakesh R
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-10434-00.patch, HDFS-10434-01.patch
>
>
> This jira is to fix the test case failure.
> Reference : 
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks 
> expected:<1> but was:<0>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at 
> org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>   at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}






[jira] [Commented] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread Kai Zheng (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406612#comment-15406612
 ] 

Kai Zheng commented on HDFS-10718:
--

Good catch, [~Sammi]! The patch looks great; +1 pending Jenkins.

> Prefer direct ByteBuffer in native RS encoder and decoder
> -
>
> Key: HDFS-10718
> URL: https://issues.apache.org/jira/browse/HDFS-10718
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10718-v1.patch
>
>
> Enable the ByteBuffer preference in the native Reed-Solomon encoder and 
> decoder, to improve overall performance. 






[jira] [Commented] (HDFS-9271) Implement basic NN operations

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406592#comment-15406592
 ] 

Hadoop QA commented on HDFS-9271:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
52s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
39s{color} | {color:green} HDFS-8707 passed with JDK v1.8.0_101 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
43s{color} | {color:green} HDFS-8707 passed with JDK v1.7.0_101 {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} HDFS-8707 passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  2m 
10s{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.8.0_101. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  2m 10s{color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.8.0_101. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m 10s{color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.8.0_101. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  2m 
11s{color} | {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_101. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  2m 11s{color} | 
{color:red} hadoop-hdfs-native-client in the patch failed with JDK v1.7.0_101. 
{color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  2m 11s{color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m  9s{color} 
| {color:red} hadoop-hdfs-native-client in the patch failed with JDK 
v1.7.0_101. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 30m 41s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0cf5e66 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821915/HDFS-9271.HDFS-8707.007.patch
 |
| JIRA Issue | HDFS-9271 |
| Optional Tests |  asflicense  compile  cc  mvnsite  javac  unit  |
| uname | Linux b215b8079c91 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | HDFS-8707 / f1b591d |
| Default Java | 1.7.0_101 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_101 
/usr/lib/jvm/java-7-openjdk-amd64:1.7.0_101 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16313/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_101.txt
 |
| cc | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16313/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_101.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16313/artifact/patchprocess/patch-compile-hadoop-hdfs-project_hadoop-hdfs-native-client-jdk1.8.0_101.txt
 |
| compile | 

[jira] [Commented] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs

2016-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406582#comment-15406582
 ] 

Hudson commented on HDFS-10569:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10208 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10208/])
HDFS-10569. A bug causes OutOfIndex error in BlockListAsLongs. (kihwal: rev 
6f63566694f8cec64a469448a8fa00ce921ce367)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/BlockListAsLongs.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/protocol/TestBlockListAsLongs.java


> A bug causes OutOfIndex error in BlockListAsLongs
> -
>
> Key: HDFS-10569
> URL: https://issues.apache.org/jira/browse/HDFS-10569
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Minor
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, 
> HDFS-10569.003.patch, HDFS-10569.004.patch
>
>
> An obvious bug in LongsDecoder.getBlockListAsLongs(): the size of the var *longs* 
> is the size of *values* plus 2, but the for-loop accesses *values* using the 
> *longs* index. This will cause an out-of-index error.






[jira] [Updated] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10569:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   2.8.0
   Status: Resolved  (was: Patch Available)

Thanks for reporting and fixing it, [~cheersyang].

> A bug causes OutOfIndex error in BlockListAsLongs
> -
>
> Key: HDFS-10569
> URL: https://issues.apache.org/jira/browse/HDFS-10569
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, 
> HDFS-10569.003.patch, HDFS-10569.004.patch
>
>
> An obvious bug in LongsDecoder.getBlockListAsLongs(): the size of the var *longs* 
> is the size of *values* plus 2, but the for-loop accesses *values* using the 
> *longs* index. This will cause an out-of-index error.






[jira] [Updated] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10569:
--
Priority: Minor  (was: Major)

> A bug causes OutOfIndex error in BlockListAsLongs
> -
>
> Key: HDFS-10569
> URL: https://issues.apache.org/jira/browse/HDFS-10569
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
>Priority: Minor
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, 
> HDFS-10569.003.patch, HDFS-10569.004.patch
>
>
> An obvious bug in LongsDecoder.getBlockListAsLongs(): the size of the var *longs* 
> is the size of *values* plus 2, but the for-loop accesses *values* using the 
> *longs* index. This will cause an out-of-index error.






[jira] [Commented] (HDFS-10569) A bug causes OutOfIndex error in BlockListAsLongs

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406554#comment-15406554
 ] 

Kihwal Lee commented on HDFS-10569:
---

The failed test is passing for me with the patch; HDFS-10572 probably fixed it.
+1, the patch looks good.

> A bug causes OutOfIndex error in BlockListAsLongs
> -
>
> Key: HDFS-10569
> URL: https://issues.apache.org/jira/browse/HDFS-10569
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.0
>Reporter: Weiwei Yang
>Assignee: Weiwei Yang
> Attachments: HDFS-10569.001.patch, HDFS-10569.002.patch, 
> HDFS-10569.003.patch, HDFS-10569.004.patch
>
>
> An obvious bug in LongsDecoder.getBlockListAsLongs(): the size of the var *longs* 
> is the size of *values* plus 2, but the for-loop accesses *values* using the 
> *longs* index. This will cause an out-of-index error.






[jira] [Commented] (HDFS-9271) Implement basic NN operations

2016-08-03 Thread James Clampffer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406547#comment-15406547
 ] 

James Clampffer commented on HDFS-9271:
---

Overall the patch looks good to me.  Just some really minor things:

1) nit: weird space between namespace and type in hdfs.cc
{code}
void set_working_directory(std:: string new_directory) { working_directory = 
new_directory; }
{code}

2) Right now hdfsFile_internal tracks bytes read.  Should this get pushed into 
FileHandle?

3) pedantic nit: 2 vs 4 space tabs in CheckSystemAndHandle
{code}
+  if (!CheckSystem(fs))
+return false;
+
+  if (!CheckHandle(file))
+  return false;
{code}

4) Might still be worth calling CheckSystem in hdfsAvailable even if it's not 
implemented, so latent bugs don't show up in surprising ways once it is 
implemented. The same goes for any other functions that are just stubbed in for 
now, like hdfsUnbufferFile.

5) Might be worth turning NameNodeOperations::IsHighBitSet into a simple 
function in common/util.h. There are other places where that would be nice to 
use, like checking before casting an unsigned into a signed type (related to 
HDFS-10554). Templating the function so it could check any integral type would 
be nice too, but I can do that later when I work on the conversion stuff.

Also, if you're as bad at counting a long row of 0s as I am, it might be easier 
to replace the hex literal with something that makes your intention very clear:
{code}
uint64_t firstBit = 0x8000000000000000;
{code}
with
{code}
uint64_t firstBit = 1ULL << 63;
{code}
(The ULL suffix keeps the 63-bit shift well-defined.) Any non-toy compiler will 
get rid of the shift with a constant-folding pass.

> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Anatoli Shein
> Attachments: HDFS-9271.HDFS-8707.000.patch, 
> HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch, 
> HDFS-9271.HDFS-8707.003.patch, HDFS-9271.HDFS-8707.004.patch, 
> HDFS-9271.HDFS-8707.005.patch, HDFS-9271.HDFS-8707.006.patch, 
> HDFS-9271.HDFS-8707.007.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner






[jira] [Updated] (HDFS-9271) Implement basic NN operations

2016-08-03 Thread James Clampffer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-9271:
--
Attachment: HDFS-9271.HDFS-8707.007.patch

I went ahead and resolved some minor merge conflicts from committing the HA 
stuff so I could review. Anatoli, can you make sure I got it right? It's really 
just the initializer list for Status and a warning message for HA.

> Implement basic NN operations
> -
>
> Key: HDFS-9271
> URL: https://issues.apache.org/jira/browse/HDFS-9271
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Bob Hansen
>Assignee: Anatoli Shein
> Attachments: HDFS-9271.HDFS-8707.000.patch, 
> HDFS-9271.HDFS-8707.001.patch, HDFS-9271.HDFS-8707.002.patch, 
> HDFS-9271.HDFS-8707.003.patch, HDFS-9271.HDFS-8707.004.patch, 
> HDFS-9271.HDFS-8707.005.patch, HDFS-9271.HDFS-8707.006.patch, 
> HDFS-9271.HDFS-8707.007.patch
>
>
> Expose via C and C++ API:
> * mkdirs
> * rename
> * delete
> * stat
> * chmod
> * chown
> * getListing
> * setOwner






[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406539#comment-15406539
 ] 

Chen Liang commented on HDFS-10682:
---

The failed tests are unrelated.

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.






[jira] [Commented] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated block counts should be retrieved with NN lock protection

2016-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406530#comment-15406530
 ] 

Hudson commented on HDFS-10710:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10206 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10206/])
HDFS-10710. In BlockManager#rescanPostponedMisreplicatedBlocks(), (jing9: rev 
f4ba5ff1d70ef92d59851c09c4bd4b43d6c04971)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


> In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated 
> block counts should be retrieved with NN lock protection
> --
>
> Key: HDFS-10710
> URL: https://issues.apache.org/jira/browse/HDFS-10710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: GAO Rui
>Assignee: GAO Rui
> Fix For: 2.8.0
>
> Attachments: HDFS-10710.1.patch
>
>
> In BlockManager#rescanPostponedMisreplicatedBlocks(), the start and end block 
> counts should be retrieved under the protection of the lock. Otherwise, log 
> records like "-1 blocks are removed", which indicate that a negative number of 
> blocks were removed, could be generated. 
> For example, consider the following scenario:
> 1. thread1 runs {{long startPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}}; 
> startPostponedMisReplicatedBlocksCount gets the value 20.
> 2. Before thread1 runs {{namesystem.writeLock();}}, thread2 increments 
> postponedMisreplicatedBlocksCount by 1, so postponedMisreplicatedBlocksCount 
> is now 21.
> 3. thread1 ends the iteration, but no postponed block is removed, so after 
> running {{long endPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}}, 
> endPostponedMisReplicatedBlocksCount gets the value 21.
> 4. thread1 generates the log:
> {noformat}
>   LOG.info("Rescan of postponedMisreplicatedBlocks completed in " +
>   (Time.monotonicNow() - startTimeRescanPostponedMisReplicatedBlocks) 
> +
>   " msecs. " + endPostponedMisReplicatedBlocksCount +
>   " blocks are left. " + (startPostponedMisReplicatedBlocksCount -
>   endPostponedMisReplicatedBlocksCount) + " blocks are removed.");
> {noformat}
> Then we'll get a log record like "-1 blocks are removed."
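
The fix direction, sketched with the FSNamesystem lock idiom the description already uses (simplified; see the committed patch for the actual change):

{code:java}
// Simplified sketch: take both readings while holding the write lock so
// a concurrent increment cannot make the reported delta negative.
namesystem.writeLock();
try {
  long start = getPostponedMisreplicatedBlocksCount();
  // ... rescan postponed blocks; removals happen under the lock ...
  long end = getPostponedMisreplicatedBlocksCount();
  LOG.info("Rescan completed. " + end + " blocks are left. "
      + (start - end) + " blocks are removed.");
} finally {
  namesystem.writeUnlock();
}
{code}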






[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406528#comment-15406528
 ] 

Hadoop QA commented on HDFS-10682:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
29s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 530 unchanged - 11 fixed = 530 total (was 541) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 60m 51s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
17s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 79m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.tracing.TestTracing |
|   | hadoop.hdfs.TestFileChecksum |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821885/HDFS-10682.009.patch |
| JIRA Issue | HDFS-10682 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux d49d7e1e4d1b 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 580a833 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16311/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16311/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16311/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: 

[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated block counts should be retrieved with NN lock protection

2016-08-03 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10710:
-
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.8.0
   Status: Resolved  (was: Patch Available)

I've committed the patch to trunk, branch-2, and branch-2.8. Thanks for the 
contribution, [~demongaorui]!

> In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated 
> block counts should be retrieved with NN lock protection
> --
>
> Key: HDFS-10710
> URL: https://issues.apache.org/jira/browse/HDFS-10710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: GAO Rui
>Assignee: GAO Rui
> Fix For: 2.8.0
>
> Attachments: HDFS-10710.1.patch
>
>
> In BlockManager#rescanPostponedMisreplicatedBlocks(), the start and end block 
> counts should be read while holding the NN lock. Otherwise, log records like "-1 
> blocks are removed", which suggest a negative number of blocks was removed, can 
> be generated. 
> For example, consider the following scenario:
> 1. thread1 runs {{long startPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}} and 
> startPostponedMisReplicatedBlocksCount gets the value 20.
> 2. before thread1 runs {{namesystem.writeLock();}}, thread2 increments 
> postponedMisreplicatedBlocksCount by 1, so postponedMisreplicatedBlocksCount 
> is now 21.
> 3. thread1 ends the iteration without removing any postponed block, so after 
> running {{long endPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}}, 
> endPostponedMisReplicatedBlocksCount gets the value 21.
> 4. thread1 generates the log:
> {noformat}
>   LOG.info("Rescan of postponedMisreplicatedBlocks completed in " +
>   (Time.monotonicNow() - startTimeRescanPostponedMisReplicatedBlocks) 
> +
>   " msecs. " + endPostponedMisReplicatedBlocksCount +
>   " blocks are left. " + (startPostponedMisReplicatedBlocksCount -
>   endPostponedMisReplicatedBlocksCount) + " blocks are removed.");
> {noformat}
> Then, we'll get the log record like "-1 blocks are removed."
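
A minimal sketch of the fix, assuming the counters are simply read while the 
namesystem write lock is held (illustrative only, not the attached patch):

{code}
// Read both counters under the NN write lock so that a concurrent
// increment cannot make the computed "removed" delta negative.
namesystem.writeLock();
long startPostponedMisReplicatedBlocksCount;
long endPostponedMisReplicatedBlocksCount;
try {
  startPostponedMisReplicatedBlocksCount = getPostponedMisreplicatedBlocksCount();
  // ... rescan postponedMisreplicatedBlocks, removing resolved entries ...
  endPostponedMisReplicatedBlocksCount = getPostponedMisreplicatedBlocksCount();
} finally {
  namesystem.writeUnlock();
}
{code}

With both reads under the same lock, the logged difference can no longer be 
negative.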



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10707) Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets

2016-08-03 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated HDFS-10707:

Attachment: HDFS-10707.branch-2.patch

fix style

> Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets
> -
>
> Key: HDFS-10707
> URL: https://issues.apache.org/jira/browse/HDFS-10707
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10707.2.patch, HDFS-10707.3.patch, 
> HDFS-10707.branch-2.patch, HDFS-10707.patch
>
>
> org.apache.commons.io.Charsets is deprecated in favor of 
> java.nio.charset.StandardCharsets
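
The substitution is mechanical; a hedged before/after sketch (variable names 
are illustrative):

{code}
import java.nio.charset.StandardCharsets;

// Before (deprecated):
// byte[] bytes = text.getBytes(org.apache.commons.io.Charsets.UTF_8);

// After (JDK 7+, no commons-io dependency needed):
byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
String roundTrip = new String(bytes, StandardCharsets.UTF_8);
{code}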



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10707) Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets

2016-08-03 Thread Vincent Poon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vincent Poon updated HDFS-10707:

Attachment: (was: HDFS-10707.branch-2.patch)

> Replace org.apache.commons.io.Charsets with java.nio.charset.StandardCharsets
> -
>
> Key: HDFS-10707
> URL: https://issues.apache.org/jira/browse/HDFS-10707
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 2.7.2
>Reporter: Vincent Poon
>Assignee: Vincent Poon
>Priority: Minor
> Fix For: 3.0.0-alpha2
>
> Attachments: HDFS-10707.2.patch, HDFS-10707.3.patch, 
> HDFS-10707.branch-2.patch, HDFS-10707.patch
>
>
> org.apache.commons.io.Charsets is deprecated in favor of 
> java.nio.charset.StandardCharsets



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated block counts should be retrieved with NN lock protection

2016-08-03 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-10710:
-
Summary: In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed 
misreplicated block counts should be retrieved with NN lock protection  (was: 
In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block 
counts should be get with the protect with lock)

> In BlockManager#rescanPostponedMisreplicatedBlocks(), postponed misreplicated 
> block counts should be retrieved with NN lock protection
> --
>
> Key: HDFS-10710
> URL: https://issues.apache.org/jira/browse/HDFS-10710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: GAO Rui
>Assignee: GAO Rui
> Attachments: HDFS-10710.1.patch
>
>
> In BlockManager#rescanPostponedMisreplicatedBlocks(), the start and end block 
> counts should be read while holding the NN lock. Otherwise, log records like "-1 
> blocks are removed", which suggest a negative number of blocks was removed, can 
> be generated. 
> For example, consider the following scenario:
> 1. thread1 runs {{long startPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}} and 
> startPostponedMisReplicatedBlocksCount gets the value 20.
> 2. before thread1 runs {{namesystem.writeLock();}}, thread2 increments 
> postponedMisreplicatedBlocksCount by 1, so postponedMisreplicatedBlocksCount 
> is now 21.
> 3. thread1 ends the iteration without removing any postponed block, so after 
> running {{long endPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}}, 
> endPostponedMisReplicatedBlocksCount gets the value 21.
> 4. thread1 generates the log:
> {noformat}
>   LOG.info("Rescan of postponedMisreplicatedBlocks completed in " +
>   (Time.monotonicNow() - startTimeRescanPostponedMisReplicatedBlocks) 
> +
>   " msecs. " + endPostponedMisReplicatedBlocksCount +
>   " blocks are left. " + (startPostponedMisReplicatedBlocksCount -
>   endPostponedMisReplicatedBlocksCount) + " blocks are removed.");
> {noformat}
> Then, we'll get the log record like "-1 blocks are removed."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406370#comment-15406370
 ] 

Chen Liang commented on HDFS-10682:
---

Thanks for the review [~arpitagarwal]! Uploaded another patch to fix these.

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-10682:
--
Attachment: HDFS-10682.009.patch

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-10682:
--
Status: Patch Available  (was: In Progress)

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Chen Liang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Liang updated HDFS-10682:
--
Status: In Progress  (was: Patch Available)

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch, 
> HDFS-10682.009.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406363#comment-15406363
 ] 

Hadoop QA commented on HDFS-10619:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 42s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 29s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestEncryptionZones |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12817765/HDFS-10619.patch |
| JIRA Issue | HDFS-10619 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux ecd97a33e009 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / bebf10d |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16310/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16310/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16310/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
> 

[jira] [Commented] (HDFS-10674) Optimize creating a full path from an inode

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406340#comment-15406340
 ] 

Kihwal Lee commented on HDFS-10674:
---

+1 the patch looks good.

> Optimize creating a full path from an inode
> ---
>
> Key: HDFS-10674
> URL: https://issues.apache.org/jira/browse/HDFS-10674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10674.patch
>
>
> {{INode#getFullPathName}} walks up the inode tree, creates an INode[], and 
> converts each component byte[] name to a String while building the path.  
> This involves many allocations, copies, and char conversions.
> The path should be built with a single byte[] allocation.
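
A sketch of the single-allocation approach ({{Node}} is a hypothetical 
stand-in for the inode type; the real patch works on INode):

{code}
import java.nio.charset.StandardCharsets;

class Node {
  final byte[] name;  // component name; empty for the root
  final Node parent;  // null for the root
  Node(byte[] name, Node parent) { this.name = name; this.parent = parent; }
}

class PathBuilder {
  // Two passes: measure the exact length, then fill a single byte[] from
  // the end while walking up the tree; one final byte->char conversion.
  static String fullPath(Node node) {
    int length = 0;
    for (Node n = node; n != null; n = n.parent) {
      length += n.name.length + (n.parent != null ? 1 : 0); // '/' per non-root
    }
    byte[] path = new byte[length];
    int pos = length;
    for (Node n = node; n != null; n = n.parent) {
      pos -= n.name.length;
      System.arraycopy(n.name, 0, path, pos, n.name.length);
      if (n.parent != null) {
        path[--pos] = '/';
      }
    }
    return new String(path, StandardCharsets.UTF_8);
  }
}
{code}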



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10674) Optimize creating a full path from an inode

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10674:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   Status: Resolved  (was: Patch Available)

I've committed this to trunk and branch-2. 

> Optimize creating a full path from an inode
> ---
>
> Key: HDFS-10674
> URL: https://issues.apache.org/jira/browse/HDFS-10674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10674.patch
>
>
> {{INode#getFullPathName}} walks up the inode tree, creates an INode[], and 
> converts each component byte[] name to a String while building the path.  
> This involves many allocations, copies, and char conversions.
> The path should be built with a single byte[] allocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10662) Optimize UTF8 string/byte conversions

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406333#comment-15406333
 ] 

Hadoop QA commented on HDFS-10662:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
12s{color} | {color:blue} Docker mode activated. {color} |
| {color:blue}0{color} | {color:blue} patch {color} | {color:blue}  0m  
1s{color} | {color:blue} The patch file was not named according to hadoop's 
naming conventions. Please see https://wiki.apache.org/hadoop/HowToContribute 
for instructions. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
31s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
14s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
6s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m  
1s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m 11s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 93m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.server.blockmanagement.TestBlockManager |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821862/HDFS-10662.patch.1 |
| JIRA Issue | HDFS-10662 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 02e9740e6eae 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2d82276 |
| Default Java | 1.8.0_101 |
| findbugs 

[jira] [Commented] (HDFS-10710) In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block counts should be get with the protect with lock

2016-08-03 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406326#comment-15406326
 ] 

Jing Zhao commented on HDFS-10710:
--

The patch looks good to me. The failed tests should be unrelated: 
{{TestDataNodeErasureCodingMetrics}} and {{TestDirectoryScanner}} fail 
intermittently and have also failed in several other jiras' Jenkins runs. 

+1. I will commit the patch shortly.

> In BlockManager#rescanPostponedMisreplicatedBlocks(), start and end block 
> counts should be get with the protect with lock
> -
>
> Key: HDFS-10710
> URL: https://issues.apache.org/jira/browse/HDFS-10710
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Reporter: GAO Rui
>Assignee: GAO Rui
> Attachments: HDFS-10710.1.patch
>
>
> In BlockManager#rescanPostponedMisreplicatedBlocks(), the start and end block 
> counts should be read while holding the NN lock. Otherwise, log records like "-1 
> blocks are removed", which suggest a negative number of blocks was removed, can 
> be generated. 
> For example, consider the following scenario:
> 1. thread1 runs {{long startPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}} and 
> startPostponedMisReplicatedBlocksCount gets the value 20.
> 2. before thread1 runs {{namesystem.writeLock();}}, thread2 increments 
> postponedMisreplicatedBlocksCount by 1, so postponedMisreplicatedBlocksCount 
> is now 21.
> 3. thread1 ends the iteration without removing any postponed block, so after 
> running {{long endPostponedMisReplicatedBlocksCount = 
> getPostponedMisreplicatedBlocksCount();}}, 
> endPostponedMisReplicatedBlocksCount gets the value 21.
> 4. thread1 generates the log:
> {noformat}
>   LOG.info("Rescan of postponedMisreplicatedBlocks completed in " +
>   (Time.monotonicNow() - startTimeRescanPostponedMisReplicatedBlocks) 
> +
>   " msecs. " + endPostponedMisReplicatedBlocksCount +
>   " blocks are left. " + (startPostponedMisReplicatedBlocksCount -
>   endPostponedMisReplicatedBlocksCount) + " blocks are removed.");
> {noformat}
> Then, we'll get the log record like "-1 blocks are removed."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10682) Replace FsDatasetImpl object lock with a separate lock object

2016-08-03 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406317#comment-15406317
 ] 

Arpit Agarwal commented on HDFS-10682:
--

Thanks for working through the patch iterations [~vagarychen]! The v8 patch is 
looking good. Just a few minor comments:
# AutoCloseableLock should not extend ReentrantLock; it already contains a lock 
object (a composition sketch follows below).
# The unused function FsDatasetSpi#releaseDatasetLock and its descendants should 
be removed.

+1 with these addressed.
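
A minimal sketch of the composition approach from comment #1, usable with 
try-with-resources (illustrative, not the committed HDFS-10682 code; the 
{{datasetLock}} field in the usage note is hypothetical):

{code}
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// AutoCloseable wrapper that *contains* a Lock rather than extending it.
public class AutoCloseableLock implements AutoCloseable {
  private final Lock lock = new ReentrantLock();

  // Takes the lock and returns this, enabling try-with-resources usage.
  public AutoCloseableLock acquire() {
    lock.lock();
    return this;
  }

  @Override
  public void close() {
    lock.unlock();
  }
}
{code}

Usage would then be {{try (AutoCloseableLock l = datasetLock.acquire())}} with 
the critical section in the try body, so the lock is released even on 
exceptions.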

> Replace FsDatasetImpl object lock with a separate lock object
> -
>
> Key: HDFS-10682
> URL: https://issues.apache.org/jira/browse/HDFS-10682
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Chen Liang
>Assignee: Chen Liang
> Attachments: HDFS-10682.001.patch, HDFS-10682.002.patch, 
> HDFS-10682.003.patch, HDFS-10682.004.patch, HDFS-10682.005.patch, 
> HDFS-10682.006.patch, HDFS-10682.007.patch, HDFS-10682.008.patch
>
>
> This Jira proposes to replace the FsDatasetImpl object lock with a separate 
> lock object. Doing so will make it easier to measure lock statistics like 
> lock held time and warn about potential lock contention due to slow disk 
> operations.
> In the future we can also consider replacing the lock with a read-write lock.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-742) A down DataNode makes Balancer to hang on repeatingly asking NameNode its partial block list

2016-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406286#comment-15406286
 ] 

Hudson commented on HDFS-742:
-

SUCCESS: Integrated in Hadoop-trunk-Commit #10203 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10203/])
HDFS-742. A down DataNode makes Balancer to hang on repeatingly asking (kihwal: 
rev 58db263e93daf08280e6a586a10cebd6122cf72a)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/balancer/Dispatcher.java


> A down DataNode makes Balancer to hang on repeatingly asking NameNode its 
> partial block list
> 
>
> Key: HDFS-742
> URL: https://issues.apache.org/jira/browse/HDFS-742
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Hairong Kuang
>Assignee: Mit Desai
>Priority: Minor
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-742-trunk.patch, HDFS-742.patch, 
> HDFS-742.v1.trunk.patch
>
>
> We had a balancer that had not made any progress for a long time. It turned 
> out it was repeatedly asking the NameNode for the partial block list of one 
> datanode, which had gone down while the balancer was running.
> The NameNode should notify the Balancer that the datanode is not available, and 
> the Balancer should stop asking for that datanode's block list.
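
A hedged sketch of the desired behavior ({{nnc.getBlocks}} follows the 
balancer's NameNodeConnector; the loop shape is illustrative, not the 
committed Dispatcher change):

{code}
// Stop re-requesting a source datanode's partial block list once the
// NameNode signals the node is gone, instead of retrying forever.
try {
  BlocksWithLocations newBlocks = nnc.getBlocks(getDatanodeInfo(), blocksToReceive);
  // ... schedule moves from newBlocks ...
} catch (UnregisteredNodeException e) {
  LOG.warn("Datanode " + getDatanodeInfo() + " is no longer available; "
      + "giving up on this source", e);
  return; // do not ask for this datanode's block list again
}
{code}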



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10673) Optimize FSPermissionChecker's internal path usage

2016-08-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406266#comment-15406266
 ] 

Daryn Sharp commented on HDFS-10673:


[~jingzhao]?

> Optimize FSPermissionChecker's internal path usage
> --
>
> Key: HDFS-10673
> URL: https://issues.apache.org/jira/browse/HDFS-10673
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10673.1.patch, HDFS-10673.patch
>
>
> The INodeAttributeProvider and AccessControlEnforcer features degrade 
> performance and generate excessive garbage even when neither is used.  Main 
> issues:
> # A byte[][] of components is unnecessarily created.  Each path component 
> lookup converts a subrange of the byte[][] to a new String[] that is then not 
> used by the default attribute provider.
> # Subaccess checks are insanely expensive.  The full path of every subdir is 
> created by walking up the inode tree, creating an INode[], and building a 
> string by converting each inode's byte[] name to a string, etc., which will 
> only be used if there's an exception.
> The expense of #1 should only be incurred when using the provider/enforcer 
> feature.  For #2, paths should be created on-demand for exceptions.
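
For #2, a minimal sketch of on-demand path construction (the 
{{Supplier}}-based shape is an assumption, not the attached patch):

{code}
import java.util.function.Supplier;

class PermissionCheckSketch {
  // The expensive full-path string is materialized only when the check
  // actually fails and an exception message is needed.
  static void checkAccess(boolean permitted, Supplier<String> fullPath) {
    if (!permitted) {
      throw new SecurityException("Permission denied: " + fullPath.get());
    }
    // Success path: no INode[] walk, no String building.
  }
}
{code}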



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406272#comment-15406272
 ] 

Kihwal Lee commented on HDFS-10715:
---

[~zhuguangbin86], I've added you as a contributor and assigned this jira to you.

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
>Assignee: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch, HDFS-10715.004.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs: 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the chooseDataNode method: 
> clusterMap.chooseRandom may return null, and a null value cannot be compared 
> with the a.equals(b) method.  
> Though the exception can be caught and the call retried, I think 
> this bug should be fixed.
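
A null-safe sketch of {{chooseDataNode}} (method and field names follow the 
stack trace above; the exact committed fix may differ):

{code}
protected DatanodeDescriptor chooseDataNode(final String scope) {
  DatanodeDescriptor a = (DatanodeDescriptor) clusterMap.chooseRandom(scope);
  DatanodeDescriptor b = (DatanodeDescriptor) clusterMap.chooseRandom(scope);
  if (a == null) {
    return b;              // may also be null; the caller handles that case
  }
  if (b == null || a.equals(b)) {
    return a;              // nothing distinct to compare against
  }
  // Prefer the node with more available space, per HDFS-8131.
  return compareDataNode(a, b) < 0 ? a : b;
}
{code}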



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10719) In HA, Namenode is failed to start If any of the Quorum hostname is unresolved.

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406268#comment-15406268
 ] 

Hadoop QA commented on HDFS-10719:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 23s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 28 new + 14 unchanged - 0 fixed = 42 total (was 14) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
 9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 19 line(s) with tabs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 37s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m 43s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestFsDatasetCache |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821855/HDFS-10719-1.patch |
| JIRA Issue | HDFS-10719 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 785ba208c8e5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / 2d82276 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16308/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| whitespace | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16308/artifact/patchprocess/whitespace-tabs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16308/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16308/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16308/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   

[jira] [Commented] (HDFS-10674) Optimize creating a full path from an inode

2016-08-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406269#comment-15406269
 ] 

Daryn Sharp commented on HDFS-10674:


TestSnapshotFileLength is unrelated.  It's not about the path length, but the 
file size.

> Optimize creating a full path from an inode
> ---
>
> Key: HDFS-10674
> URL: https://issues.apache.org/jira/browse/HDFS-10674
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10674.patch
>
>
> {{INode#getFullPathName}} walks up the inode tree, creates an INode[], and 
> converts each component byte[] name to a String while building the path.  
> This involves many allocations, copies, and char conversions.
> The path should be built with a single byte[] allocation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10715:
--
Assignee: Guangbin Zhu

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
>Assignee: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch, HDFS-10715.004.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs: 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the chooseDataNode method: 
> clusterMap.chooseRandom may return null, and a null value cannot be compared 
> with the a.equals(b) method.  
> Though the exception can be caught and the call retried, I think 
> this bug should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-742) A down DataNode makes Balancer to hang on repeatingly asking NameNode its partial block list

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-742:

   Resolution: Fixed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   2.8.0
   Status: Resolved  (was: Patch Available)

Thanks for working on this [~mitdesai]. I've committed this to trunk, branch-2 
and branch-2.8. 

> A down DataNode makes Balancer to hang on repeatingly asking NameNode its 
> partial block list
> 
>
> Key: HDFS-742
> URL: https://issues.apache.org/jira/browse/HDFS-742
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Hairong Kuang
>Assignee: Mit Desai
>Priority: Minor
> Fix For: 2.8.0, 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-742-trunk.patch, HDFS-742.patch, 
> HDFS-742.v1.trunk.patch
>
>
> We had a balancer that had not made any progress for a long time. It turned 
> out it was repeatedly asking the NameNode for the partial block list of one 
> datanode, which had gone down while the balancer was running.
> The NameNode should notify the Balancer that the datanode is not available, and 
> the Balancer should stop asking for that datanode's block list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9781) FsDatasetImpl#getBlockReports can occasionally throw NullPointerException

2016-08-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406257#comment-15406257
 ] 

Daryn Sharp commented on HDFS-9781:
---

Yes, in {{FsDatasetImpl#getBlockReports()}}, this chunk:
{code}
List<FsVolumeImpl> curVolumes = volumes.getVolumes();
for (FsVolumeSpi v : curVolumes) {
  builders.put(v.getStorageID(), BlockListAsLongs.builder(maxDataLength));
}
{code}

should be moved into the synchronized block. I thought I had fixed this.
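
A sketch of the suggested change (illustrative; the surrounding method body 
is elided):

{code}
synchronized (this) {
  // Initialize the per-storage builders while holding the dataset lock so
  // that a concurrently removed volume cannot leave a stale entry behind.
  List<FsVolumeImpl> curVolumes = volumes.getVolumes();
  for (FsVolumeSpi v : curVolumes) {
    builders.put(v.getStorageID(), BlockListAsLongs.builder(maxDataLength));
  }
  // ... walk the replica map and fill the builders under the same lock ...
}
{code}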

> FsDatasetImpl#getBlockReports can occasionally throw NullPointerException
> -
>
> Key: HDFS-9781
> URL: https://issues.apache.org/jira/browse/HDFS-9781
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 3.0.0-alpha1
> Environment: Jenkins
>Reporter: Wei-Chiu Chuang
>Assignee: Xiao Chen
> Attachments: HDFS-9781.01.patch
>
>
> FsDatasetImpl#getBlockReports occasionally throws NPE. The NPE caused 
> TestFsDatasetImpl#testRemoveVolumeBeingWritten to time out, because the test 
> waits for the call to FsDatasetImpl#getBlockReports to complete without 
> exceptions.
> Additionally, the test should be updated to identify an expected exception, 
> using {{GenericTestUtils.assertExceptionContains()}}
> {noformat}
> Exception in thread "Thread-20" java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockReports(FsDatasetImpl.java:1709)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl$1BlockReportThread.run(TestFsDatasetImpl.java:587)
> 2016-02-08 15:47:30,379 [Thread-21] WARN  impl.TestFsDatasetImpl 
> (TestFsDatasetImpl.java:run(606)) - Exception caught. This should not affect 
> the test
> java.io.IOException: Failed to move meta file for ReplicaBeingWritten, 
> blk_0_0, RBW
>   getNumBytes() = 0
>   getBytesOnDisk()  = 0
>   getVisibleLength()= 0
>   getVolume()   = 
> /home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current
>   getBlockFile()= 
> /home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current/bpid-0/current/rbw/blk_0
>   bytesAcked=0
>   bytesOnDisk=0 from 
> /home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current/bpid-0/current/rbw/blk_0_0.meta
>  to 
> /home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current/bpid-0/current/finalized/subdir0/subdir0/blk_0_0.meta
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.moveBlockFiles(FsDatasetImpl.java:857)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addFinalizedBlock(BlockPoolSlice.java:295)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addFinalizedBlock(FsVolumeImpl.java:819)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.finalizeReplica(FsDatasetImpl.java:1620)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.finalizeBlock(FsDatasetImpl.java:1601)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl$1ResponderThread.run(TestFsDatasetImpl.java:603)
> Caused by: java.io.IOException: 
> renameTo(src=/home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current/bpid-0/current/rbw/blk_0_0.meta,
>  
> dst=/home/weichiu/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/Nmi6rYndvr/data0/current/bpid-0/current/finalized/subdir0/subdir0/blk_0_0.meta)
>  failed.
> at org.apache.hadoop.io.nativeio.NativeIO.renameTo(NativeIO.java:873)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.moveBlockFiles(FsDatasetImpl.java:855)
> ... 5 more
> 2016-02-08 15:47:34,381 [Thread-19] INFO  impl.FsDatasetImpl 
> (FsVolumeList.java:waitVolumeRemoved(287)) - Volume reference is released.
> 2016-02-08 15:47:34,384 [Thread-19] INFO  impl.TestFsDatasetImpl 
> (TestFsDatasetImpl.java:testRemoveVolumeBeingWritten(622)) - Volumes removed
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10656) Optimize conversion of byte arrays back to path string

2016-08-03 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406250#comment-15406250
 ] 

Hudson commented on HDFS-10656:
---

SUCCESS: Integrated in Hadoop-trunk-Commit #10202 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10202/])
HDFS-10656. Optimize conversion of byte arrays back to path string. (kihwal: 
rev bebf10d2455cad1aa8985553417d4d74a61150ee)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java


> Optimize conversion of byte arrays back to path string
> --
>
> Key: HDFS-10656
> URL: https://issues.apache.org/jira/browse/HDFS-10656
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10656.patch
>
>
> {{DFSUtil.byteArray2PathString}} generates excessive object allocation.
> # each byte array is encoded to a string (copy)
> # string appended to a builder which extracts the chars from the intermediate 
> string (copy) and adds to its own char array
> # builder's char array is re-alloced if over 16 chars (copy)
> # builder's toString creates another string (copy)
> Instead of allocating all these objects and performing multiple byte/char 
> encoding/decoding conversions, the byte array can be built in-place with a 
> single final conversion to a string.
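
A simplified stand-alone sketch of the in-place approach (the real DFSUtil 
method may differ in signature and edge cases):

{code}
import java.nio.charset.StandardCharsets;

class PathStrings {
  // Build the joined path in one byte[] and decode it to a String exactly
  // once, instead of decoding every component separately.
  static String byteArray2PathString(byte[][] components) {
    if (components.length == 0) {
      return "";
    }
    int length = components.length - 1;       // one '/' between components
    for (byte[] component : components) {
      length += component.length;
    }
    byte[] path = new byte[length];
    int pos = 0;
    for (int i = 0; i < components.length; i++) {
      if (i > 0) {
        path[pos++] = '/';
      }
      System.arraycopy(components[i], 0, path, pos, components[i].length);
      pos += components[i].length;
    }
    return new String(path, StandardCharsets.UTF_8);  // single conversion
  }
}
{code}

For example, the components {{"", "a", "b"}} (empty root component first) 
yield {{"/a/b"}}.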



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-742) A down DataNode makes Balancer to hang on repeatingly asking NameNode its partial block list

2016-08-03 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406242#comment-15406242
 ] 

Daryn Sharp commented on HDFS-742:
--

+1 A triple-digit jira!

> A down DataNode makes Balancer to hang on repeatingly asking NameNode its 
> partial block list
> 
>
> Key: HDFS-742
> URL: https://issues.apache.org/jira/browse/HDFS-742
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Hairong Kuang
>Assignee: Mit Desai
>Priority: Minor
> Attachments: HDFS-742-trunk.patch, HDFS-742.patch, 
> HDFS-742.v1.trunk.patch
>
>
> We had a balancer that had not made any progress for a long time. It turned 
> out it was repeatedly asking the NameNode for the partial block list of one 
> datanode, which had gone down while the balancer was running.
> The NameNode should notify the Balancer that the datanode is not available, and 
> the Balancer should stop asking for that datanode's block list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-742) A down DataNode makes Balancer to hang on repeatingly asking NameNode its partial block list

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406233#comment-15406233
 ] 

Kihwal Lee commented on HDFS-742:
-

The test failure is due to HDFS-9781.

> A down DataNode makes Balancer to hang on repeatingly asking NameNode its 
> partial block list
> 
>
> Key: HDFS-742
> URL: https://issues.apache.org/jira/browse/HDFS-742
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: balancer & mover
>Reporter: Hairong Kuang
>Assignee: Mit Desai
>Priority: Minor
> Attachments: HDFS-742-trunk.patch, HDFS-742.patch, 
> HDFS-742.v1.trunk.patch
>
>
> We had a balancer that had not made any progress for a long time. It turned 
> out it was repeatedly asking the NameNode for the partial block list of one 
> datanode, which had gone down while the balancer was running.
> The NameNode should notify the Balancer that the datanode is not available, and 
> the Balancer should stop asking for that datanode's block list.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10619) Cache path in InodesInPath

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406222#comment-15406222
 ] 

Kihwal Lee commented on HDFS-10619:
---

HDFS-10653 got committed. I will run this through precommit one more time.

> Cache path in InodesInPath
> --
>
> Key: HDFS-10619
> URL: https://issues.apache.org/jira/browse/HDFS-10619
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10619.patch
>
>
> INodesInPath#getPath, a frequently called method, dynamically builds the 
> path.  IIP should cache the path upon construction.
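
A minimal sketch of the caching idea (field names are illustrative, not the 
real INodesInPath members); assuming the components do not change after 
construction, the cached string can never go stale:

{code}
import org.apache.hadoop.hdfs.DFSUtil;

public final class CachedPathSketch {
  private final byte[][] components;
  private final String path;   // built exactly once, at construction

  public CachedPathSketch(byte[][] components) {
    this.components = components;
    this.path = DFSUtil.byteArray2PathString(components);  // cache up front
  }

  public String getPath() {
    return path;               // no per-call rebuilding
  }
}
{code}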



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10656) Optimize conversion of byte arrays back to path string

2016-08-03 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15406204#comment-15406204
 ] 

Kihwal Lee commented on HDFS-10656:
---

It looks like existing test cases (e.g. TestPathComponents) are sufficient.
+1

> Optimize conversion of byte arrays back to path string
> --
>
> Key: HDFS-10656
> URL: https://issues.apache.org/jira/browse/HDFS-10656
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10656.patch
>
>
> {{DFSUtil.byteArray2PathString}} generates excessive object allocation.
> # each byte array is encoded to a string (copy)
> # the string is appended to a builder, which extracts the chars from the 
> intermediate string (copy) and adds them to its own char array
> # the builder's char array is re-allocated if over 16 chars (copy)
> # the builder's toString creates yet another string (copy)
> Instead of allocating all these objects and performing multiple byte/char 
> encoding/decoding conversions, the byte array can be built in-place with a 
> single final conversion to a string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10656) Optimize conversion of byte arrays back to path string

2016-08-03 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-10656:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.0.0-alpha2
   2.9.0
   Status: Resolved  (was: Patch Available)

Committed this to trunk and branch-2. 

> Optimize conversion of byte arrays back to path string
> --
>
> Key: HDFS-10656
> URL: https://issues.apache.org/jira/browse/HDFS-10656
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Fix For: 2.9.0, 3.0.0-alpha2
>
> Attachments: HDFS-10656.patch
>
>
> {{DFSUtil.byteArray2PathString}} generates excessive object allocation.
> # each byte array is encoded to a string (copy)
> # the string is appended to a builder, which extracts the chars from the 
> intermediate string (copy) and adds them to its own char array
> # the builder's char array is re-allocated if over 16 chars (copy)
> # the builder's toString creates yet another string (copy)
> Instead of allocating all these objects and performing multiple byte/char 
> encoding/decoding conversions, the byte array can be built in-place with a 
> single final conversion to a string.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10662) Optimize UTF8 string/byte conversions

2016-08-03 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-10662:
---
Attachment: HDFS-10662.patch.1

Resolved minor conflicts.

> Optimize UTF8 string/byte conversions
> -
>
> Key: HDFS-10662
> URL: https://issues.apache.org/jira/browse/HDFS-10662
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
> Attachments: HDFS-10662.patch, HDFS-10662.patch.1
>
>
> String/byte conversions may take either a Charset instance or its canonical 
> name.  One might think a Charset instance would be faster due to avoiding a 
> lookup and instantiation of a Charset, but it's not.  The canonical string 
> name variants will cache the string encoder/decoder (obtained from a Charset) 
> resulting in better performance.
> LOG4J2-935 describes a real-world performance boost.  I micro-benched a 
> marginal runtime improvement on jdk 7/8.  However for a 16 byte path, using 
> the canonical name generated 50% less garbage.  For a 64 byte path, 25% of 
> the garbage.  Given the sheer number of times that paths are (re)parsed, the 
> cost adds up quickly.
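
A micro-example of the two variants (the caching behavior is as described 
above, observed on jdk 7/8):

{code}
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class Utf8ConversionSketch {
  public static void main(String[] args) throws UnsupportedEncodingException {
    byte[] bytes = "/user/hadoop".getBytes(StandardCharsets.UTF_8);

    // Charset-instance variant: instantiates a fresh decoder on every call.
    String viaInstance = new String(bytes, StandardCharsets.UTF_8);

    // Canonical-name variant: the JDK caches the decoder, so repeated
    // calls generate less garbage (the effect measured above).
    String viaName = new String(bytes, "UTF-8");

    System.out.println(viaInstance.equals(viaName));  // true
  }
}
{code}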



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Status: Patch Available  (was: Open)

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 2.7.1
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
> Attachments: HDFS-10719-1.patch
>
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> *and the failover is not successful*
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.
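
A hedged sketch of the described approach (not the attached patch; the names 
and the strict-majority rule are assumptions for illustration):

{code}
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class QuorumResolutionSketch {
  // Keep the addresses that resolve; fail only if no strict majority remains.
  static List<InetSocketAddress> resolveQuorum(List<String> hosts, int port) {
    List<InetSocketAddress> resolved = new ArrayList<>();
    for (String host : hosts) {
      InetSocketAddress addr = new InetSocketAddress(host, port);
      if (addr.isUnresolved()) {
        System.err.println("WARN: journal node " + host
            + " is unresolvable, skipping");
      } else {
        resolved.add(addr);
      }
    }
    if (resolved.size() <= hosts.size() / 2) {
      throw new IllegalArgumentException(
          "cannot resolve a majority of the journal quorum");
    }
    return resolved;
  }

  public static void main(String[] args) {
    List<String> hosts =
        Arrays.asList("localhost", "127.0.0.1", "no-such-host.invalid");
    System.out.println(resolveQuorum(hosts, 8485));  // two of three resolve
  }
}
{code}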



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Description: 
2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
Failed to start namenode.
java.lang.IllegalArgumentException: Unable to construct journal, 
qjournal://1:8485;2:8485;3:8485/shva
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
... 13 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
... 18 more
2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
Exiting with status 1
2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
SHUTDOWN_MSG:

*and the failover is not successful*

I have attached a patch. It allows the NameNode to start if a majority of 
the quorum hosts are resolvable, and it logs a warning for each quorum host 
that is not resolvable.



  was:
2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
Failed to start namenode.
java.lang.IllegalArgumentException: Unable to construct journal, 
qjournal://1:8485;2:8485;3:8485/shva
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
at 

[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Description: 
2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
Failed to start namenode.
java.lang.IllegalArgumentException: Unable to construct journal, 
qjournal://1:8485;2:8485;3:8485/shva
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
... 13 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
... 18 more
2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
Exiting with status 1
2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
SHUTDOWN_MSG:

and the *failover* is not successful.

I have attached a patch. It allows the NameNode to start if a majority of 
the quorum hosts are resolvable, and it logs a warning for each quorum host 
that is not resolvable.



  was:
2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
Failed to start namenode.
java.lang.IllegalArgumentException: Unable to construct journal, 
qjournal://1:8485;2:8485;3:8485/shva
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
at 

[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Attachment: HDFS-10719-1.patch

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 2.7.1
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
> Attachments: HDFS-10719-1.patch
>
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Attachment: (was: HADOOP-HDFS-10719-1-patch)

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 2.7.1
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Affects Version/s: 2.7.1

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 2.7.1
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Attachment: (was: HADOOP-HDFS-10719-2.patch)

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
>Affects Versions: 2.7.1
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Attachment: HADOOP-HDFS-10719-2.patch

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
> Attachments: HADOOP-HDFS-10719-1-patch, HADOOP-HDFS-10719-2.patch
>
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palanisamy updated HDFS-10719:
--
Attachment: HADOOP-HDFS-10719-1-patch

> In HA, NameNode fails to start if any of the quorum hostnames is 
> unresolved.
> ---
>
> Key: HDFS-10719
> URL: https://issues.apache.org/jira/browse/HDFS-10719
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: journal-node, namenode
> Environment: HDP-2.4
>Reporter: Karthik Palanisamy
> Attachments: HADOOP-HDFS-10719-1-patch
>
>
> 2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
> Failed to start namenode.
> java.lang.IllegalArgumentException: Unable to construct journal, 
> qjournal://1:8485;2:8485;3:8485/shva
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
> ... 13 more
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
> at 
> org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
> at 
> org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
> ... 18 more
> 2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
> Exiting with status 1
> 2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
> SHUTDOWN_MSG:
> and the *failover* is not successful.
> I have attached a patch. It allows the NameNode to start if a majority of 
> the quorum hosts are resolvable, and it logs a warning for each quorum host 
> that is not resolvable.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-10719) In HA, NameNode fails to start if any of the quorum hostnames is unresolved.

2016-08-03 Thread Karthik Palanisamy (JIRA)
Karthik Palanisamy created HDFS-10719:
-

 Summary: In HA, NameNode fails to start if any of the quorum 
hostnames is unresolved.
 Key: HDFS-10719
 URL: https://issues.apache.org/jira/browse/HDFS-10719
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: journal-node, namenode
 Environment: HDP-2.4
Reporter: Karthik Palanisamy


2016-08-03 02:53:53,760 ERROR namenode.NameNode (NameNode.java:main(1712)) - 
Failed to start namenode.
java.lang.IllegalArgumentException: Unable to construct journal, 
qjournal://1:8485;2:8485;3:8485/shva
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1637)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initJournals(FSEditLog.java:282)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.initSharedJournalsForRead(FSEditLog.java:260)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.initEditLog(FSImage.java:789)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:634)
at 
org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:983)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:688)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:662)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:726)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:951)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:935)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1641)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1707)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at 
org.apache.hadoop.hdfs.server.namenode.FSEditLog.createJournal(FSEditLog.java:1635)
... 13 more
Caused by: java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.getName(IPCLoggerChannelMetrics.java:107)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannelMetrics.create(IPCLoggerChannelMetrics.java:91)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel.<init>(IPCLoggerChannel.java:178)
at 
org.apache.hadoop.hdfs.qjournal.client.IPCLoggerChannel$1.createLogger(IPCLoggerChannel.java:156)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:367)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.createLoggers(QuorumJournalManager.java:149)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:116)
at 
org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.<init>(QuorumJournalManager.java:105)
... 18 more
2016-08-03 02:53:53,765 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - 
Exiting with status 1
2016-08-03 02:53:53,768 INFO  namenode.NameNode (LogAdapter.java:info(47)) - 
SHUTDOWN_MSG:

and the *failover* is not successful.

I have attached a patch. It allows the NameNode to start if a majority of 
the quorum hosts are resolvable, and it logs a warning for each quorum host 
that is not resolvable.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-8498) Blocks can be committed with wrong size

2016-08-03 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405968#comment-15405968
 ] 

Vinayakumar B commented on HDFS-8498:
-

We experienced this on one of our test clusters under high load.

*Scenario:*
The error occurred on an HBase RegionServer's WAL file.
1. In HBase, multiple threads perform the write, sync and close of the same 
WAL file.
2. The actual writer writes the entries, multiple syncers call hsync on the 
same stream, and a roller thread rolls the WALs at regular intervals, i.e. 
closes the current WAL file and opens another one for the next entries.
3. During the file close() by the roller, the last block got committed with a 
smaller size than was present on all DNs.
4. All IBRs reported by the DNs carried a greater length than the length 
COMMITTED by the client, so all those replicas were marked as CORRUPT.
5. We use IBR batching with {{dfs.namenode.file.close.num-committed-allowed=1}}, 
so the client (HBase RS) did not experience any problem: the file got closed 
successfully without waiting for the correct IBR for the last block.

*Current Analysis:*

HDFS-9289 safeguarded the re-assignment of {{DataStreamer#block}} during 
pipeline update by making it volatile, but it did not actually protect the 
contents of the {{block}}.

*Suspected problem:*
1. The ResponseProcessor updates the block size after receiving every ack by 
calling {{ExtendedBlock.setNumBytes()}}, which internally updates the numBytes 
of the internal {{block}}; this is not thread safe.
2. The LogRoller calls close, passing {{DataStreamer#block}} as the last block. 
Our GUESS is that at this point {{ExtendedBlock.getNumBytes()}} does not return 
the latest value updated by the ResponseProcessor but some earlier update, 
because ExtendedBlock and its internal block are not thread safe.
With this smaller size the block gets COMMITTED at the NameNode, and all the 
IBRs get marked as CORRUPT.

*Possible solution:*
Make ExtendedBlock thread safe for setNumBytes() and getNumBytes(), as 
sketched below.

If the above analysis makes sense, then we can raise a jira and contribute 
the fix.
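
A minimal sketch of that fix, assuming synchronized accessors suffice (for a 
single long field, marking it volatile would equally fix both the visibility 
and the word-tearing problem):

{code}
// Sketch only, not the actual ExtendedBlock class: make the length safely
// visible between the ResponseProcessor and the thread closing the file.
public class SafeBlockLengthSketch {
  private long numBytes;

  public synchronized void setNumBytes(long numBytes) {
    this.numBytes = numBytes;  // written by the ResponseProcessor on every ack
  }

  public synchronized long getNumBytes() {
    return numBytes;           // read when committing/closing the block
  }
}
{code}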

Note:
We hit this issue three times on a 40-core/380 GB RAM machine. We are trying 
to reproduce it again with more logging, but no luck so far.
It was also reproduced once with DEBUG logs; from those it is confirmed that 
the complete() call is sent only after all ACKs are received. But the DEBUG 
logs had no information about the numBytes sent during complete(), so we could 
not actually verify that this would be the fix.

> Blocks can be committed with wrong size
> ---
>
> Key: HDFS-8498
> URL: https://issues.apache.org/jira/browse/HDFS-8498
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.5.0
>Reporter: Daryn Sharp
>Assignee: Daryn Sharp
>Priority: Critical
>
> When an IBR for a UC block arrives, the NN updates the expected location's 
> block and replica state _only_ if it's on an unexpected storage for an 
> expected DN.  If it's for an expected storage, only the genstamp is updated.  
> When the block is committed, and the expected locations are verified, only 
> the genstamp is checked.  The size is not checked but it wasn't updated in 
> the expected locations anyway.
> A faulty client may misreport the size when committing the block.  The block 
> is effectively corrupted.  If the NN issues replications, the received IBR is 
> considered corrupt, so the NN invalidates the block and immediately issues 
> another replication.  The NN eventually realizes all the original replicas are 
> corrupt after full BRs are received from the original DNs.
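
For illustration, the kind of commit-time validation the description suggests 
is missing (names are hypothetical, not the NameNode code):

{code}
public class CommitCheckSketch {
  // Verify the client-committed size against the replica length known to the
  // NN, in addition to the existing genstamp check.
  static void commitBlock(long replicaLen, long clientLen,
                          long replicaGs, long clientGs) {
    if (replicaGs != clientGs) {
      throw new IllegalStateException("genstamp mismatch: " + clientGs);
    }
    if (replicaLen != clientLen) {          // the missing size check
      throw new IllegalStateException("committed size " + clientLen
          + " disagrees with replica length " + replicaLen);
    }
  }

  public static void main(String[] args) {
    commitBlock(1024, 1024, 1001, 1001);    // consistent commit: passes
    try {
      commitBlock(2048, 1024, 1001, 1001);  // faulty client misreports the size
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}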



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405915#comment-15405915
 ] 

Hadoop QA commented on HDFS-10625:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 28s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 58 unchanged - 1 fixed = 60 total (was 59) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 16s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}100m  7s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821820/HDFS-10625.004.patch |
| JIRA Issue | HDFS-10625 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux fb852dea8bdd 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d848184 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16307/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16307/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16307/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16307/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.

[jira] [Updated] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-08-03 Thread Yiqun Lin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yiqun Lin updated HDFS-10625:
-
Attachment: HDFS-10625.004.patch

>  VolumeScanner to report why a block is found bad
> -
>
> Key: HDFS-10625
> URL: https://issues.apache.org/jira/browse/HDFS-10625
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Rushabh S Shah
>  Labels: supportability
> Attachments: HDFS-10625-1.patch, HDFS-10625.003.patch, 
> HDFS-10625.004.patch, HDFS-10625.patch
>
>
> VolumeScanner may report:
> {code}
> WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> blk_1170125248_96458336 on /d/dfs/dn
> {code}
> It would be helpful to report the reason why the block is bad, especially 
> when the block is corrupt, including where the first corrupted chunk in the 
> block is.
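
A minimal sketch of what such an enriched warning could look like; the message 
shape and the reason/offset variables are illustrative assumptions, not the 
attached patch:

{code}
// Hypothetical enriched log line: same WARN, plus the reason and the
// offset of the first corrupted chunk (all names here are assumed).
LOG.warn("Reporting bad {} on {}: {} at chunk offset {}",
    block, volume, corruptionReason, firstCorruptChunkOffset);
{code}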



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-08-03 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405774#comment-15405774
 ] 

Yiqun Lin commented on HDFS-10625:
--

Thanks a lot for the review, [~vinayrpet]. Posted a new patch addressing 
that. I believe this will be the final patch for this jira :).

>  VolumeScanner to report why a block is found bad
> -
>
> Key: HDFS-10625
> URL: https://issues.apache.org/jira/browse/HDFS-10625
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Rushabh S Shah
>  Labels: supportability
> Attachments: HDFS-10625-1.patch, HDFS-10625.003.patch, 
> HDFS-10625.004.patch, HDFS-10625.patch
>
>
> VolumeScanner may report:
> {code}
> WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> blk_1170125248_96458336 on /d/dfs/dn
> {code}
> It would be helpful to report the reason why the block is bad, especially 
> when the block is corrupt, including where the first corrupted chunk in the 
> block is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10586) Erasure Code misfunctions when 3 DataNode down

2016-08-03 Thread gao shan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405759#comment-15405759
 ] 

gao shan commented on HDFS-10586:
-

By running "hadoop fsck" , I find the following info.  It shows  the file 
/gaotest9/io_data/test_io_7  does not have 9 replicated blocks,   The machine  
"192.168.122.57" is lost for this file. 


/gaotest9/io_data/test_io_5 536870912 bytes, 1 block(s):  OK
0. BP-1490713436-192.168.122.69-1469603790713:blk_-9223372036854726320_8994 
len=536870912 Live_repl=9  [/default-rack/192.168.122.200:50010, 
/default-rack/192.168.122.214:50010, /default-rack/192.168.122.43:50010, 
/default-rack/192.168.122.57:50010, /default-rack/192.168.122.23:50010, 
/default-rack/192.168.122.198:50010, /default-rack/192.168.122.92:50010, 
/default-rack/192.168.122.111:50010, /default-rack/192.168.122.178:50010]

/gaotest9/io_data/test_io_6 536870912 bytes, 1 block(s):  OK
0. BP-1490713436-192.168.122.69-1469603790713:blk_-9223372036854726256_8998 
len=536870912 Live_repl=9  [/default-rack/192.168.122.214:50010, 
/default-rack/192.168.122.43:50010, /default-rack/192.168.122.57:50010, 
/default-rack/192.168.122.111:50010, /default-rack/192.168.122.23:50010, 
/default-rack/192.168.122.200:50010, /default-rack/192.168.122.178:50010, 
/default-rack/192.168.122.198:50010, /default-rack/192.168.122.92:50010]

/gaotest9/io_data/test_io_7 536870912 bytes, 1 block(s):  Under replicated 
BP-1490713436-192.168.122.69-1469603790713:blk_-9223372036854726064_9013. 
Target Replicas is 9 but found 8 live replica(s), 0 decommissioned replica(s) 
and 0 decommissioning replica(s).
0. BP-1490713436-192.168.122.69-1469603790713:blk_-9223372036854726064_9013 
len=536870912 Live_repl=8  [/default-rack/192.168.122.178:50010, 
/default-rack/192.168.122.198:50010, /default-rack/192.168.122.92:50010, 
/default-rack/192.168.122.214:50010, /default-rack/192.168.122.23:50010, 
/default-rack/192.168.122.200:50010, /default-rack/192.168.122.111:50010, 
/default-rack/192.168.122.43:50010]

/gaotest9/io_data/test_io_8 536870912 bytes, 1 block(s):  OK
0. BP-1490713436-192.168.122.69-1469603790713:blk_-9223372036854726048_9012 
len=536870912 Live_repl=9  [/default-rack/192.168.122.92:50010, 
/default-rack/192.168.122.198:50010, /default-rack/192.168.122.178:50010, 
/default-rack/192.168.122.200:50010, /default-rack/192.168.122.214:50010, 
/default-rack/192.168.122.23:50010, /default-rack/192.168.122.57:50010, 
/default-rack/192.168.122.43:50010, /default-rack/192.168.122.111:50010]



> Erasure Code misfunctions when 3 DataNode down
> --
>
> Key: HDFS-10586
> URL: https://issues.apache.org/jira/browse/HDFS-10586
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: erasure-coding
>Affects Versions: 3.0.0-alpha1
> Environment: 9 DataNodes and 1 NameNode; the erasure code policy is 
> set to "6-3". When 3 DataNodes go down, erasure coding fails and an 
> exception is thrown
>Reporter: gao shan
>
> The following is the steps to reproduce:
> 1) hadoop fs -mkdir /ec
> 2) set the erasure code policy to "6-3"
> 3) write data with:
> time hadoop jar 
> /opt/hadoop/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT.jar
>   TestDFSIO -D test.build.data=/ec -write -nrFiles 30 -fileSize 12288 
> -bufferSize 1073741824
> 4) Manually take down 3 nodes: kill the "datanode" and "nodemanager" 
> processes on 3 of the DataNodes.
> 5) Read the erasure-coded data back with:
> time hadoop jar 
> /opt/hadoop/hadoop-3.0.0-SNAPSHOT/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-SNAPSHOT.jar
>   TestDFSIO -D test.build.data=/ec -read -nrFiles 30 -fileSize 12288 
> -bufferSize 1073741824
> The read then fails with the following exception:
> INFO mapreduce.Job: Task Id : attempt_1465445965249_0008_m_34_2, Status : 
> FAILED
> Error: java.io.IOException: 4 missing blocks, the stripe is: Offset=0, 
> length=8388608, fetchedChunksNum=0, missingChunksNum=4
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.checkMissingBlocks(DFSStripedInputStream.java:614)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.readParityChunks(DFSStripedInputStream.java:647)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream$StripeReader.readStripe(DFSStripedInputStream.java:762)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readOneStripe(DFSStripedInputStream.java:316)
>   at 
> org.apache.hadoop.hdfs.DFSStripedInputStream.readWithStrategy(DFSStripedInputStream.java:450)
>   at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:941)
>   
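
For context on the arithmetic in the error above (assuming the standard 
RS(6,3) layout): each stripe has 6 data units + 3 parity units = 9 units, so 
at most 3 missing units per stripe can be reconstructed. The error reports 
fetchedChunksNum=0 and missingChunksNum=4, i.e. 4 of the 9 units were 
unreachable, which is one more than the parity can cover, so the stripe 
cannot be decoded.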

[jira] [Commented] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405692#comment-15405692
 ] 

Hadoop QA commented on HDFS-10715:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
54s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 58m 10s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
18s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 78m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821800/HDFS-10715.004.patch |
| JIRA Issue | HDFS-10715 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 55a5fae978a0 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d848184 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16305/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16305/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16305/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: 

[jira] [Updated] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-10718:
-
Attachment: HDFS-10718-v1.patch

> Prefer direct ByteBuffer in native RS encoder and decoder
> -
>
> Key: HDFS-10718
> URL: https://issues.apache.org/jira/browse/HDFS-10718
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
> Attachments: HDFS-10718-v1.patch
>
>
> Enable ByteBuffer preference in the native Reed-Solomon encoder and decoder, 
> to improve the overall performance. 
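
A minimal sketch of why direct ByteBuffers are preferred by a native (JNI) 
coder: their memory is addressable from native code without an extra copy, 
whereas heap buffers must be copied in and out. The buffer counts, unit size, 
and the commented coder call are assumptions for illustration, not the patch:

{code}
import java.nio.ByteBuffer;

public class DirectBufferSketch {
  public static void main(String[] args) {
    // RS(6,3): 6 data units in, 3 parity units out (unit size assumed).
    ByteBuffer[] inputs = new ByteBuffer[6];
    ByteBuffer[] outputs = new ByteBuffer[3];
    for (int i = 0; i < inputs.length; i++) {
      inputs[i] = ByteBuffer.allocateDirect(64 * 1024);   // off-heap data unit
    }
    for (int i = 0; i < outputs.length; i++) {
      outputs[i] = ByteBuffer.allocateDirect(64 * 1024);  // off-heap parity unit
    }
    // encoder.encode(inputs, outputs);  // a native coder can read and write
    //                                   // these buffers in place, with no copy
  }
}
{code}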



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread SammiChen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

SammiChen updated HDFS-10718:
-
Status: Patch Available  (was: Open)

> Prefer direct ByteBuffer in native RS encoder and decoder
> -
>
> Key: HDFS-10718
> URL: https://issues.apache.org/jira/browse/HDFS-10718
> Project: Hadoop HDFS
>  Issue Type: Task
>Reporter: SammiChen
>Assignee: SammiChen
>
> Enable ByteBuffer preference in the native Reed-Solomon encoder and decoder, 
> to improve the overall performance. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Guangbin Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangbin Zhu updated HDFS-10715:

Attachment: HDFS-10715.004.patch

fix checkstyle

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch, HDFS-10715.004.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs : 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the method chooseDataNode: 
> clusterMap.chooseRandom may return null, and a null candidate cannot be 
> compared with a.equals(b). Although the exception can be caught and the 
> call retried, I think this bug should be fixed.
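
A minimal sketch of the null guard the description calls for; the helper 
below is hypothetical (the real fix would live in chooseDataNode), shown only 
to illustrate comparing candidates safely when chooseRandom can return null:

{code}
import java.util.Comparator;

// Hypothetical helper: pick the preferred of two candidates when either
// may be null (mirrors the guard chooseDataNode needs).
static <T> T pickPreferred(T a, T b, Comparator<T> prefer) {
  if (a == null) {
    return b;                      // b may also be null; the caller handles it
  }
  if (b == null || a.equals(b)) {  // safe now: a is known to be non-null
    return a;
  }
  return prefer.compare(a, b) >= 0 ? a : b;
}
{code}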



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-4176) EditLogTailer should call rollEdits with a timeout

2016-08-03 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405585#comment-15405585
 ] 

Surendra Singh Lilhore commented on HDFS-4176:
--

Hi [~eddyxu]

One minor comment.

{code}
+  public static final String DFS_HA_TAILEDITS_ROLLEDITS_TIMEOUT_KEY =
+  "dfs.ha.tail-edits.rolledits.timeout";
{code}

Can we rename this property to {{dfs.ha.log-roll.execution.timeout}}? That 
would be consistent with {{dfs.ha.log-roll.rpc.timeout}}.

> EditLogTailer should call rollEdits with a timeout
> --
>
> Key: HDFS-4176
> URL: https://issues.apache.org/jira/browse/HDFS-4176
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode
>Affects Versions: 2.0.2-alpha, 3.0.0-alpha1
>Reporter: Todd Lipcon
>Assignee: Lei (Eddy) Xu
> Attachments: HDFS-4176-branch-2.0.patch, 
> HDFS-4176-branch-2.003.patch, HDFS-4176-branch-2.1.patch, 
> HDFS-4176-branch-2.2.patch, HDFS-4176.00.patch, HDFS-4176.01.patch, 
> HDFS-4176.02.patch, HDFS-4176.03.patch, HDFS-4176.04.patch, namenode.jstack4
>
>
> When the EditLogTailer thread calls rollEdits() on the active NN via RPC, it 
> currently does so without a timeout. So, if the active NN has frozen (but not 
> actually crashed), this call can hang forever. This can then potentially 
> prevent the standby from becoming active.
> This may actually be considered a side effect of HADOOP-6762 -- if the RPC 
> were interruptible, that would also fix the issue.
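
A minimal sketch of bounding the blocking rollEdits() RPC with a timeout, so 
a frozen but not crashed active NN cannot hang the tailer thread forever. The 
executor, the method name rollEditsWithTimeout, and getActiveNodeProxy() are 
illustrative assumptions, not the attached patch:

{code}
import java.io.IOException;
import java.util.concurrent.*;

private final ExecutorService rollEditsExecutor =
    Executors.newSingleThreadExecutor();

void rollEditsWithTimeout(long timeoutMs) throws IOException {
  Future<Void> f = rollEditsExecutor.submit(() -> {
    getActiveNodeProxy().rollEditLog();  // the blocking RPC (names assumed)
    return null;
  });
  try {
    f.get(timeoutMs, TimeUnit.MILLISECONDS);
  } catch (TimeoutException e) {
    f.cancel(true);  // stop waiting; the standby can retry or fail over
    throw new IOException("rollEdits timed out after " + timeoutMs + " ms", e);
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new IOException("interrupted while rolling edits", e);
  } catch (ExecutionException e) {
    throw new IOException(e.getCause());
  }
}
{code}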



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-08-03 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405567#comment-15405567
 ] 

Vinayakumar B commented on HDFS-10625:
--

One nit; I thought it was already mentioned.
{{replicaInfoString.append(" for replica: \n" + replica.toString());}}
The "\n" can be avoided here; keeping the message on one line makes it 
easier to grep the logs for the complete message.

+1 once addressed.
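
A one-line sketch of the suggested change (replicaInfoString is the 
StringBuilder from the patch under review; everything else is unchanged):

{code}
// Keep the reason on the same line so one grep catches the whole message.
replicaInfoString.append(" for replica: ").append(replica);
{code}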

>  VolumeScanner to report why a block is found bad
> -
>
> Key: HDFS-10625
> URL: https://issues.apache.org/jira/browse/HDFS-10625
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Rushabh S Shah
>  Labels: supportability
> Attachments: HDFS-10625-1.patch, HDFS-10625.003.patch, 
> HDFS-10625.patch
>
>
> VolumeScanner may report:
> {code}
> WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> blk_1170125248_96458336 on /d/dfs/dn
> {code}
> It would be helpful to report the reason why the block is bad, especially 
> when the block is corrupt, including where the first corrupted chunk in the 
> block is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405547#comment-15405547
 ] 

Hadoop QA commented on HDFS-10715:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
8s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 25s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 30 unchanged - 0 fixed = 32 total (was 30) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 65m  5s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 12s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestEditLog |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821785/HDFS-10715.003.patch |
| JIRA Issue | HDFS-10715 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  findbugs  checkstyle  |
| uname | Linux 0abc293cddde 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d848184 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16304/artifact/patchprocess/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16304/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16304/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16304/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: 

[jira] [Commented] (HDFS-10625) VolumeScanner to report why a block is found bad

2016-08-03 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405522#comment-15405522
 ] 

Yiqun Lin commented on HDFS-10625:
--

Thanks [~yzhangal] and [~shahrs87] for the review.
Hi [~vinayrpet], does the latest patch also look good to you? If so, I think 
we can commit this to trunk.

>  VolumeScanner to report why a block is found bad
> -
>
> Key: HDFS-10625
> URL: https://issues.apache.org/jira/browse/HDFS-10625
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs
>Reporter: Yongjun Zhang
>Assignee: Rushabh S Shah
>  Labels: supportability
> Attachments: HDFS-10625-1.patch, HDFS-10625.003.patch, 
> HDFS-10625.patch
>
>
> VolumeScanner may report:
> {code}
> WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad 
> blk_1170125248_96458336 on /d/dfs/dn
> {code}
> It would be helpful to report the reason why the block is bad, especially 
> when the block is corrupt, including where the first corrupted chunk in the 
> block is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10645) Make block report size as a metric and add this metric to datanode web ui

2016-08-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405476#comment-15405476
 ] 

Hadoop QA commented on HDFS-10645:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  8m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  8m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  7m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
30s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
54s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  8m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
49s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  7m 46s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 69m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}132m 47s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.log.TestLogLevel |
|   | hadoop.hdfs.qjournal.TestNNWithQJM |
|   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
|   | hadoop.hdfs.server.balancer.TestBalancer |
|   | hadoop.hdfs.TestDecommissionWithStriped |
|   | hadoop.hdfs.TestRenameWhileOpen |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821769/HDFS-10645.007.patch |
| JIRA Issue | HDFS-10645 |
| Optional Tests |  asflicense  mvnsite  compile  javac  javadoc  mvninstall  
unit  findbugs  checkstyle  |
| uname | Linux f9c455d757e9 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 
Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh 
|
| git revision | trunk / d28c2d9 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16303/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16303/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16303/testReport/ |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/16303/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.

[jira] [Created] (HDFS-10718) Prefer direct ByteBuffer in native RS encoder and decoder

2016-08-03 Thread SammiChen (JIRA)
SammiChen created HDFS-10718:


 Summary: Prefer direct ByteBuffer in native RS encoder and decoder
 Key: HDFS-10718
 URL: https://issues.apache.org/jira/browse/HDFS-10718
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: SammiChen
Assignee: SammiChen


Enable ByteBuffer preference in the native Reed-Solomon encoder and decoder, to 
improve the overall performance. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Guangbin Zhu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangbin Zhu updated HDFS-10715:

Attachment: HDFS-10715.003.patch

code refactoring 

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch, 
> HDFS-10715.003.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs : 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the method chooseDataNode: 
> clusterMap.chooseRandom may return null, and a null candidate cannot be 
> compared with a.equals(b). Although the exception can be caught and the 
> call retried, I think this bug should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10715) NPE when applying AvailableSpaceBlockPlacementPolicy

2016-08-03 Thread Guangbin Zhu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405391#comment-15405391
 ] 

Guangbin Zhu commented on HDFS-10715:
-

Thanks Akira, I'll fix it and submit another patch.

> NPE when applying AvailableSpaceBlockPlacementPolicy
> 
>
> Key: HDFS-10715
> URL: https://issues.apache.org/jira/browse/HDFS-10715
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.8.0
> Environment: cdh5.8.0
>Reporter: Guangbin Zhu
> Attachments: HDFS-10715.001.patch, HDFS-10715.002.patch
>
>
> HDFS-8131 introduced AvailableSpaceBlockPlacementPolicy, but in some 
> cases it causes an NPE. 
> Here are my namenode daemon logs : 
> 2016-08-02 13:05:03,271 WARN org.apache.hadoop.ipc.Server: IPC Server handler 
> 13 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 
> 10.132.89.79:14001 Call#56 Retry#0
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.compareDataNode(AvailableSpaceBlockPlacementPolicy.java:95)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.AvailableSpaceBlockPlacementPolicy.chooseDataNode(AvailableSpaceBlockPlacementPolicy.java:80)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:665)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseLocalRack(BlockPlacementPolicyDefault.java:572)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTargetInOrder(BlockPlacementPolicyDefault.java:457)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:367)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:242)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:114)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:130)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1606)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3315)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:679)
> at 
> org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:489)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)
> I reviewed the source code and found the bug in the method chooseDataNode: 
> clusterMap.chooseRandom may return null, and a null candidate cannot be 
> compared with a.equals(b). Although the exception can be caught and the 
> call retried, I think this bug should be fixed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


