[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-03-31 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388080#comment-14388080
 ] 

zhouyingchao commented on HDFS-7999:


Thank you for looking into the patch.  Here is an explanation of the logic of 
createTemporary() after applying the patch (a condensed sketch of the control 
flow follows these steps):
1.  If there is no ReplicaInfo in volumeMap for the passed-in ExtendedBlock b, 
then we create one, insert it into volumeMap, and return from line 1443.
2.  If there is a ReplicaInfo in volumeMap and its GS is newer than that of the 
passed-in ExtendedBlock b, then we throw ReplicaAlreadyExistsException from 
line 1447.
3.  If there is a ReplicaInfo in volumeMap whose GS is older than that of the 
passed-in ExtendedBlock b, then this is a new write and the earlier writer 
should be stopped.  We release the FsDatasetImpl lock and try to stop the 
earlier writer without holding the lock.
4.  After the earlier writer is stopped, we need to evict its ReplicaInfo from 
volumeMap, and to that end we re-acquire the FsDatasetImpl lock.  However, 
since this thread released the FsDatasetImpl lock while it was stopping the 
earlier writer, another thread might have come in and changed the ReplicaInfo 
of this block in volumeMap.  This situation is unlikely, but we still have to 
handle it.  The loop in the patch handles exactly this case: after re-acquiring 
the FsDatasetImpl lock, it checks whether the current ReplicaInfo in volumeMap 
is still the one we saw before stopping the writer.  If so, we can simply 
evict it, create and insert a new one, and return from line 1443.  Otherwise, 
another thread slipped in and changed the ReplicaInfo while we were stopping 
the earlier writer.  In that case, we check whether that thread inserted a 
block with an even newer GS than ours; if so, we throw 
ReplicaAlreadyExistsException from line 1447.  Otherwise we need to stop that 
thread's writer, just as we stopped the earlier writer in step 3.
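
For reference, here is a condensed sketch of that control flow. This is a 
hedged illustration only: the helper names (getReplica, createAndInsertReplica, 
evict, stopWriter) and the lock object are simplified stand-ins, not the 
actual FsDatasetImpl code.
{code}
while (true) {
  ReplicaInfo lastFound;
  synchronized (this) {
    ReplicaInfo current = getReplica(b);
    if (current == null) {
      return createAndInsertReplica(b);           // step 1 (line 1443)
    }
    if (current.getGenerationStamp() >= b.getGenerationStamp()) {
      throw new ReplicaAlreadyExistsException(    // step 2 (line 1447)
          "Block " + b + " already exists with a newer GS");
    }
    lastFound = current;                          // step 3: older GS
  }
  stopWriter(lastFound);              // stop the writer WITHOUT the lock
  synchronized (this) {
    if (getReplica(b) == lastFound) { // step 4: nothing slipped in
      evict(lastFound);
      return createAndInsertReplica(b);           // line 1443
    }
    // Another thread changed the ReplicaInfo while we were unlocked;
    // loop and re-evaluate steps 1-3 against the new ReplicaInfo.
  }
}
{code}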


 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I got the jstack twice and it 
 looks like they are all blocked (at getStorageReport) by the dataset lock, 
 which is held by a thread that is calling createTemporary, which in turn is 
 blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)

[jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files

2015-03-31 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388090#comment-14388090
 ] 

Li Bo commented on HDFS-7889:
-

hi, Zhe
{{stripedBlocks[i]}} is an instance of {{BlockingQueue}}, not {{LocatedBlock}}, 
and I cannot see any code that would add a non-LocatedBlock object to this 
queue. Is it necessary to check the type of each element retrieved from the 
queue?
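
To illustrate the point, a minimal sketch (assuming {{stripedBlocks[i]}} is 
declared with {{LocatedBlock}} as its element type; {{locatedBlock}} here is 
just a placeholder variable):
{code}
BlockingQueue<LocatedBlock> queue = new LinkedBlockingQueue<>();
queue.offer(locatedBlock);        // anything else fails to compile
LocatedBlock lb = queue.poll();   // statically typed; no instanceof needed
{code}
With a parameterized queue the compiler already rules out other element types, 
so a per-element runtime check would be redundant.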


 Subclass DFSOutputStream to support writing striping layout files
 -

 Key: HDFS-7889
 URL: https://issues.apache.org/jira/browse/HDFS-7889
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, 
 HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch


 After HDFS-7888, we can subclass  {{DFSOutputStream}} to support writing 
 striping layout files. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7954) TestBalancer#testBalancerWithPinnedBlocks failed on Windows

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7954:
-
Assignee: Xiaoyu Yao
  Status: Patch Available  (was: Open)

 TestBalancer#testBalancerWithPinnedBlocks failed on Windows
 ---

 Key: HDFS-7954
 URL: https://issues.apache.org/jira/browse/HDFS-7954
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7947.00.patch


 {code}
 testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer)
   Time elapsed: 22.624 sec   <<< FAILURE!
 java.lang.AssertionError: expected:<-3> but was:<0>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:353)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7954) TestBalancer#testBalancerWithPinnedBlocks failed on Windows

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7954:
-
Attachment: HDFS-7947.00.patch

Posted a patch to skip this test on Windows.

 TestBalancer#testBalancerWithPinnedBlocks failed on Windows
 ---

 Key: HDFS-7954
 URL: https://issues.apache.org/jira/browse/HDFS-7954
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Xiaoyu Yao
 Attachments: HDFS-7947.00.patch


 {code}
 testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer)
   Time elapsed: 22.624 sec   <<< FAILURE!
 java.lang.AssertionError: expected:<-3> but was:<0>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:353)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8026) Trace FSOutputSummer#writeChecksumChunks rather than DFSOutputStream#writeChunk

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388100#comment-14388100
 ] 

Hadoop QA commented on HDFS-8026:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708293/HDFS-8026.001.patch
  against trunk revision 1a495fb.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestSetrepIncreasing
  org.apache.hadoop.tracing.TestTracing

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10122//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10122//console

This message is automatically generated.

 Trace FSOutputSummer#writeChecksumChunks rather than 
 DFSOutputStream#writeChunk
 ---

 Key: HDFS-8026
 URL: https://issues.apache.org/jira/browse/HDFS-8026
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor
 Attachments: HDFS-8026.001.patch


 We should trace FSOutputSummer#writeChecksumChunks rather than 
 DFSOutputStream#writeChunk.  When tracing writeChunk, we get a new trace span 
 every 512 bytes; when tracing writeChecksumChunks, we normally get a new 
 trace span only when the FSOutputSummer buffer is full (9x less often.)
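
 As a rough sanity check of the 9x figure (a hedged sketch, assuming the 
 512-byte chunk size quoted above and a 9-chunk FSOutputSummer buffer, which 
 is what the 9x implies):
 {code}
 long chunkSize = 512;
 long spansPerMiBWriteChunk = (1L << 20) / chunkSize;        // 2048 spans
 long spansPerMiBChecksumChunks = spansPerMiBWriteChunk / 9; // ~228 spans
 {code}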



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7811) Avoid recursive call getStoragePolicyID in INodeFile#computeQuotaUsage

2015-03-31 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388160#comment-14388160
 ] 

Xiaoyu Yao commented on HDFS-7811:
--

I can't find an easy way to unit-test that the recursive call does not happen 
without adding test hooks to production code. 

 Avoid recursive call getStoragePolicyID in INodeFile#computeQuotaUsage
 --

 Key: HDFS-7811
 URL: https://issues.apache.org/jira/browse/HDFS-7811
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7811.00.patch, HDFS-7811.01.patch


 This is a follow-up based on a comment from [~jingzhao] on HDFS-7723. 
 I just noticed that INodeFile#computeQuotaUsage calls getStoragePolicyID to 
 identify the storage policy id of the file. This may not be very efficient 
 (especially when we're computing the quota usage of a directory) because 
 getStoragePolicyID may recursively check the ancestral INode's storage 
 policy. I think an improvement here could be to pass the lowest parent 
 directory's storage policy down while traversing the tree. 
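
 A minimal sketch of that idea (hypothetical signatures and accessors, not the 
 actual INode API): the parent resolves the policy once and hands it down, so 
 getStoragePolicyID never has to recurse up the ancestors.
 {code}
 long computeQuotaUsage(INode node, byte inheritedPolicyId) {
   byte policyId = node.hasLocalStoragePolicy()
       ? node.getLocalStoragePolicyId()    // hypothetical accessors
       : inheritedPolicyId;                // inherit; no upward lookup
   long used = node.isFile() ? node.storageConsumed(policyId) : 0;
   for (INode child : node.children()) {
     used += computeQuotaUsage(child, policyId);  // pass the policy down
   }
   return used;
 }
 {code}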



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7922) ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388203#comment-14388203
 ] 

Hadoop QA commented on HDFS-7922:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708313/004-HDFS-7922.patch
  against trunk revision cce66ba.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestAuditLogs

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10123//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10123//console

This message is automatically generated.

 ShortCircuitCache#close is not releasing ScheduledThreadPoolExecutors
 -

 Key: HDFS-7922
 URL: https://issues.apache.org/jira/browse/HDFS-7922
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Rakesh R
Assignee: Rakesh R
 Attachments: 001-HDFS-7922.patch, 002-HDFS-7922.patch, 
 003-HDFS-7922.patch, 004-HDFS-7922.patch


 ShortCircuitCache has the following executors. It would be good to shut down 
 these pools during ShortCircuitCache#close to avoid leaks.
 {code}
   /**
    * The executor service that runs the cacheCleaner.
    */
   private final ScheduledThreadPoolExecutor cleanerExecutor
       = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
           setDaemon(true).setNameFormat("ShortCircuitCache_Cleaner").
           build());

   /**
    * The executor service that runs the cacheCleaner.
    */
   private final ScheduledThreadPoolExecutor releaserExecutor
       = new ScheduledThreadPoolExecutor(1, new ThreadFactoryBuilder().
           setDaemon(true).setNameFormat("ShortCircuitCache_SlotReleaser").
           build());
 {code}
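
 One possible shape for such a fix (a hedged sketch only; the actual patch may 
 differ, and java.util.concurrent.TimeUnit is assumed to be imported):
 {code}
 public void close() {
   // Shut both pools down and wait briefly so their daemon threads
   // do not leak past the cache's lifetime.
   cleanerExecutor.shutdown();
   releaserExecutor.shutdown();
   try {
     cleanerExecutor.awaitTermination(30, TimeUnit.SECONDS);
     releaserExecutor.awaitTermination(30, TimeUnit.SECONDS);
   } catch (InterruptedException e) {
     Thread.currentThread().interrupt();  // restore interrupt status
   }
 }
 {code}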



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-5019) Cleanup imports in HDFS project

2015-03-31 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388238#comment-14388238
 ] 

Tsuyoshi Ozawa commented on HDFS-5019:
--

Hi [~djp], thank you for updating. Could you rebase the patch?

 Cleanup imports in HDFS project
 ---

 Key: HDFS-5019
 URL: https://issues.apache.org/jira/browse/HDFS-5019
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Junping Du
Assignee: Junping Du
Priority: Minor
 Attachments: HDFS-5019-v2.patch, HDFS-5019.patch


 There are some unused imports in the current code base which cause 
 unnecessary Java warnings. Also, imports should be in alphabetical order, and 
 wildcard imports (import x.x.*) are not recommended.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7939) Two fsimage_rollback_* files are created which are not deleted after rollback.

2015-03-31 Thread J.Andreina (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

J.Andreina updated HDFS-7939:
-
Status: Patch Available  (was: Open)

 Two fsimage_rollback_* files are created which are not deleted after rollback.
 --

 Key: HDFS-7939
 URL: https://issues.apache.org/jira/browse/HDFS-7939
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Priority: Critical
 Attachments: HDFS-7939.1.patch


 During a checkpoint, if uploading to the remote Namenode fails, then 
 restarting the Namenode with the rollingUpgrade started option creates two 
 fsimage_rollback_* files at the active Namenode.
 On rolling upgrade rollback, the initially created fsimage_rollback_* file is 
 not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-03-31 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388273#comment-14388273
 ] 

Xinwei Qin  commented on HDFS-7999:
---

Yeah, it's a good and necessary idea to keep the createTemporary() method from 
holding the lock for a long time.

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I got the jstack twice and it 
 looks like they are all blocked (at getStorageReport) by the dataset lock, 
 which is held by a thread that is calling createTemporary, which in turn is 
 blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7933:
-
Attachment: (was: HDFS-7933.02.patch)

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7933:
-
Attachment: HDFS-7933.02.patch

Thanks [~jnp] for reviewing the patch. I've updated the patch based on your 
feedback. 

Summary of changes:

1) Add NumberReplicas#decommissioned and NumberReplicas#decommissioning to 
track the decommissioned and decommissioning replicas, respectively (sketched 
below). 

2) Deprecate NumberReplicas#decommissionedReplicas() in favor of 
NumberReplicas#decommissionedAndDecommissioning() to avoid the misleading name.

3) Display decommissioning and decommissioned replicas separately in 
NamenodeFsck#check().
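
Roughly, the NumberReplicas change looks like this (a hedged sketch, not the 
exact patch):
{code}
private int decommissioned;   // replicas on nodes done decommissioning
private int decommissioning;  // replicas on nodes still decommissioning

@Deprecated  // misleading name; kept for compatibility
public int decommissionedReplicas() {
  return decommissionedAndDecommissioning();
}

public int decommissionedAndDecommissioning() {
  return decommissioned + decommissioning;
}
{code}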

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch, 
 HDFS-7933.02.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7933:
-
Attachment: HDFS-7933.02.patch

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7933:
-
Attachment: (was: HDFS-7933.02.patch)

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-7933:
-
Attachment: HDFS-7933.02.patch

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch, 
 HDFS-7933.02.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell

2015-03-31 Thread Peter Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388235#comment-14388235
 ] 

Peter Shi commented on HDFS-7701:
-

Thanks for giving such a detailed suggestion; I will upload the fixed patch ASAP.

 Support reporting per storage type quota and usage with hadoop/hdfs shell
 -

 Key: HDFS-7701
 URL: https://issues.apache.org/jira/browse/HDFS-7701
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Peter Shi
 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, 
 HDFS-7701.03.patch


 hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk 
 space quota and remaining quota information. With HDFS-7584, we want to 
 display per storage type quota and its remaining information as well.
 The current output format as shown below may not easily accommodate 6 more 
 columns = 3 (existing storage types) * 2 (quota/remaining quota). With new 
 storage types added in the future, this would make the output even more 
 crowded. There are also compatibility concerns, as we don't want to break any 
 existing scripts that monitor hadoop fs -count -q output. 
 $ hadoop fs -count -q -v /test
    QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
     none        inf    524288000        524266569          1          15         21431  /test
 Propose to add a -t parameter to display ONLY the storage type quota 
 information of the directory, separately. This way, existing scripts 
 will work as-is without using the -t parameter. 
 1) When -t is not followed by a specific storage type, quota and usage 
 information for all storage types will be displayed. 
 $ hadoop fs -count -q  -t -h -v /test
 SSD_QUOTA  REM_SSD_QUOTA  DISK_QUOTA  REM_DISK_QUOTA  ARCHIVAL_QUOTA  REM_ARCHIVAL_QUOTA  PATHNAME
     512MB          256MB        none             inf            none                 inf  /test
 2) If -t is followed by a storage type, only the quota and remaining quota of 
 that storage type are displayed. 
 $ hadoop fs -count -q  -t SSD -h -v /test
  SSD_QUOTA  REM_SSD_QUOTA  PATHNAME
     512 MB         256 MB  /test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks

2015-03-31 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-7949 started by Rakesh R.
--
 WebImageViewer need support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
Priority: Minor
 Attachments: HDFS-7949-001.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell

2015-03-31 Thread Peter Shi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Shi updated HDFS-7701:

Attachment: HDFS-7701.04.patch

 Support reporting per storage type quota and usage with hadoop/hdfs shell
 -

 Key: HDFS-7701
 URL: https://issues.apache.org/jira/browse/HDFS-7701
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Peter Shi
 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, 
 HDFS-7701.03.patch, HDFS-7701.04.patch


 hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk 
 space quota and remaining quota information. With HDFS-7584, we want to 
 display per storage type quota and its remaining information as well.
 The current output format as shown below may not easily accommodate 6 more 
 columns = 3 (existing storage types) * 2 (quota/remaining quota). With new 
 storage types added in the future, this would make the output even more 
 crowded. There are also compatibility concerns, as we don't want to break any 
 existing scripts that monitor hadoop fs -count -q output. 
 $ hadoop fs -count -q -v /test
    QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
     none        inf    524288000        524266569          1          15         21431  /test
 Propose to add a -t parameter to display ONLY the storage type quota 
 information of the directory, separately. This way, existing scripts 
 will work as-is without using the -t parameter. 
 1) When -t is not followed by a specific storage type, quota and usage 
 information for all storage types will be displayed. 
 $ hadoop fs -count -q  -t -h -v /test
 SSD_QUOTA  REM_SSD_QUOTA  DISK_QUOTA  REM_DISK_QUOTA  ARCHIVAL_QUOTA  REM_ARCHIVAL_QUOTA  PATHNAME
     512MB          256MB        none             inf            none                 inf  /test
 2) If -t is followed by a storage type, only the quota and remaining quota of 
 that storage type are displayed. 
 $ hadoop fs -count -q  -t SSD -h -v /test
  SSD_QUOTA  REM_SSD_QUOTA  PATHNAME
     512 MB         256 MB  /test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Description: 
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.

  was:
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files


 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files
 + Remove existing files
 + Replace existing files
 This is required in cases where data is stored in AVRO format in HDFS and the 
 corresponding .avsc files are used to create Hive external tables.
 This will lead to the small files (.avsc files in this case) problem when 
 there are a large number of tables that need to be loaded into Hive as 
 external tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7949) WebImageViewer need support file size calculation with striped blocks

2015-03-31 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7949:
---
Attachment: HDFS-7949-001.patch

 WebImageViewer need support file size calculation with striped blocks
 -

 Key: HDFS-7949
 URL: https://issues.apache.org/jira/browse/HDFS-7949
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Hui Zheng
Assignee: Rakesh R
Priority: Minor
 Attachments: HDFS-7949-001.patch


 The file size calculation in WebImageViewer should be changed when the blocks 
 of the file are striped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7716) Erasure Coding: extend BlockInfo to handle EC info

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7716:

Fix Version/s: HDFS-7285

 Erasure Coding: extend BlockInfo to handle EC info
 --

 Key: HDFS-7716
 URL: https://issues.apache.org/jira/browse/HDFS-7716
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Jing Zhao
Assignee: Jing Zhao
 Fix For: HDFS-7285

 Attachments: HDFS-7716.000.patch, HDFS-7716.001.patch, 
 HDFS-7716.002.patch, HDFS-7716.003.patch


 The current BlockInfo implementation only supports the replication 
 mechanism. To use the same blocksMap to handle a block group and its 
 data/parity blocks, we need to define a new BlockGroupInfo class.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Description: 
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files

  was:Is there a plan to support updatable HAR Filesystem? If so, by when is 
this expected please?


 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files
 + Remove existing files
 + Replace existing files



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Issue Type: Improvement  (was: Bug)

 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files
 + Remove existing files
 + Replace existing files
 This is required in cases where data is stored in AVRO format in HDFS and the 
 corresponding .avsc files are used to create Hive external tables.
 This will lead to the small files (.avsc files in this case) problem when 
 there are a large number of tables that need to be loaded into Hive as 
 external tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7652) Process block reports for erasure coded blocks

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-7652:

Fix Version/s: HDFS-7285

 Process block reports for erasure coded blocks
 --

 Key: HDFS-7652
 URL: https://issues.apache.org/jira/browse/HDFS-7652
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Fix For: HDFS-7285

 Attachments: HDFS-7652.001.patch, HDFS-7652.002.patch, 
 HDFS-7652.003.patch, HDFS-7652.004.patch, HDFS-7652.005.patch, 
 HDFS-7652.006.patch


 HDFS-7339 adds support in NameNode for persisting block groups. For memory 
 efficiency, erasure coded blocks under the striping layout are not stored in 
 {{BlockManager#blocksMap}}. Instead, entire block groups are stored in 
 {{BlockGroupManager#blockGroups}}. When a block report arrives from the 
 DataNode, it should be processed under the block group that it belongs to. 
 The following naming protocol is used to calculate the group of a given block:
 {code}
  * HDFS-EC introduces a hierarchical protocol to name blocks and groups:
  * Contiguous: {reserved block IDs | flag | block ID}
  * Striped: {reserved block IDs | flag | block group ID | index in group}
  *
  * Following n bits of reserved block IDs, The (n+1)th bit in an ID
  * distinguishes contiguous (0) and striped (1) blocks. For a striped block,
  * bits (n+2) to (64-m) represent the ID of its block group, while the last m
  * bits represent its index of the group. The value m is determined by the
  * maximum number of blocks in a group (MAX_BLOCKS_IN_GROUP).
 {code}
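
 As an illustration of that layout, here is a hedged decoding sketch assuming 
 m = 4 low bits for the in-group index (i.e. MAX_BLOCKS_IN_GROUP = 16); the 
 real constants may differ:
 {code}
 static final int  INDEX_BITS = 4;                      // hypothetical m
 static final long INDEX_MASK = (1L << INDEX_BITS) - 1;

 // The low m bits are the block's index within its group; the rest
 // (together with the striped flag) identify the block group.
 static long blockGroupId(long blockId) { return blockId & ~INDEX_MASK; }
 static int  indexInGroup(long blockId) { return (int) (blockId & INDEX_MASK); }
 {code}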



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8011) standby nn can't started

2015-03-31 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388371#comment-14388371
 ] 

Vinayakumar B commented on HDFS-8011:
-

Hi [~fujie], can you attach a little more of the log around the above-mentioned exceptions?

 standby nn can't started
 

 Key: HDFS-8011
 URL: https://issues.apache.org/jira/browse/HDFS-8011
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.3.0
 Environment: centeros 6.2  64bit 
Reporter: fujie

 We have seen a crash when starting the standby namenode, with fatal errors. 
 Any solutions, workarounds, or ideas would be helpful for us.
 1. Here is the context: 
   At the beginning we had 2 namenodes, with A as active and B as standby. 
 For some reason, namenode A died, so namenode B has been working as active.
   When we tried to restart A after a minute, it didn't work. During this 
 time a lot of files were put to HDFS, and a lot of files were renamed. 
   Namenode A crashed while awaiting reported blocks in safe mode each 
 time.
  
 2. We can see the error log below:
   1)2015-03-30  ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
 on operation CloseOp [length=0, inodeId=0, 
 path=/xxx/_temporary/xxx/part-r-00074.bz2, replication=3, 
 mtime=1427699913947, atime=1427699081161, blockSize=268435456, 
 blocks=[blk_2103131025_1100889495739], permissions=dm:dm:rw-r--r--, 
 clientName=, clientMachine=, opCode=OP_CLOSE, txid=7632753612]
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.setGenerationStampAndVerifyReplicas(BlockInfoUnderConstruction.java:247)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.commitBlock(BlockInfoUnderConstruction.java:267)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:639)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:813)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:383)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:209)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:122)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:737)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$0(EditLogTailer.java:302)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:356)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
 at 
 org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:413)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
 
2)2015-03-30  FATAL 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error 
 encountered while tailing edits. Shutting down standby NN.
 java.io.IOException: Failed to apply edit log operation AddBlockOp 
 [path=/xxx/_temporary/xxx/part-m-00121, 
 penultimateBlock=blk_2102331803_1100888911441, 
 lastBlock=blk_2102661068_1100889009168, RpcClientId=, RpcCallId=-2]: error
 null
 at 
 org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:215)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:122)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:737)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$0(EditLogTailer.java:302)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:356)
 

[jira] [Commented] (HDFS-7933) fsck should also report decommissioning replicas.

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388386#comment-14388386
 ] 

Hadoop QA commented on HDFS-7933:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708365/HDFS-7933.02.patch
  against trunk revision 85dc3c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
  
org.apache.hadoop.hdfs.server.namenode.TestDefaultBlockPlacementPolicy

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10126//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10126//console

This message is automatically generated.

 fsck should also report decommissioning replicas. 
 --

 Key: HDFS-7933
 URL: https://issues.apache.org/jira/browse/HDFS-7933
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Reporter: Jitendra Nath Pandey
Assignee: Xiaoyu Yao
 Attachments: HDFS-7933.00.patch, HDFS-7933.01.patch, 
 HDFS-7933.02.patch


 Fsck doesn't count replicas that are on decommissioning nodes. If a block has 
 all of its replicas on decommissioning nodes, it will be marked as missing, 
 which is alarming for the admins, although the system will replicate them 
 before the nodes are decommissioned.
 Fsck output should also show decommissioning replicas along with the live 
 replicas.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-03-31 Thread Xinwei Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388309#comment-14388309
 ] 

Xinwei Qin  commented on HDFS-7999:
---

Hi [~cmccabe]
Thanks for your comment.
{quote}
even if we made the heartbeat lockless, there are still many other problems 
associated with having FsDatasetImpl#createTemporary hold the FSDatasetImpl 
lock for a very long time. Any thread that needs to read or write from the 
datanode will be blocked.
{quote}
Making the heartbeat lockless can prevent the DataNode from being considered 
dead, and I think it is a necessary 
patch ([https://issues.apache.org/jira/browse/HDFS-7060]). 
The FSDatasetImpl lock being held for a long time is a separate problem; maybe 
the patch in this jira can alleviate it.

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I got the jstack twice and it 
 looks like they are all blocked (at getStorageReport) by the dataset lock, 
 which is held by a thread that is calling createTemporary, which in turn is 
 blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8027) Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-8027:

Attachment: HDFS-8027-01.patch

Attaching for reference. 
Jiras are ordered by their Jira resolution date.

 Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits
 

 Key: HDFS-8027
 URL: https://issues.apache.org/jira/browse/HDFS-8027
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Attachments: HDFS-8027-01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-8027) Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B resolved HDFS-8027.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285

Committed to the HDFS-7285 branch.
Committed directly, as this is only a CHANGES-HDFS-7285.txt update.

 Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits
 

 Key: HDFS-8027
 URL: https://issues.apache.org/jira/browse/HDFS-8027
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Fix For: HDFS-7285

 Attachments: HDFS-8027-01.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7285) Erasure Coding Support inside HDFS

2015-03-31 Thread Vinayakumar B (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388329#comment-14388329
 ] 

Vinayakumar B commented on HDFS-7285:
-

Hi,
I think most of the commits to HDFS-7285 were not added to 
CHANGES-HDFS-EC-7285.txt.
Keeping it updated will help us update CHANGES.txt at the time of merging to 
trunk, and hence record the contributions.
Very happy to see many new people contributing to this work.

For all commits till now, I have updated CHANGES-HDFS-EC-7285.txt through 
HDFS-8027.
Please take care of this for further commits.
Thanks.

 Erasure Coding Support inside HDFS
 --

 Key: HDFS-7285
 URL: https://issues.apache.org/jira/browse/HDFS-7285
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Weihua Jiang
Assignee: Zhe Zhang
 Attachments: ECAnalyzer.py, ECParser.py, HDFS-7285-initial-PoC.patch, 
 HDFSErasureCodingDesign-20141028.pdf, HDFSErasureCodingDesign-20141217.pdf, 
 HDFSErasureCodingDesign-20150204.pdf, HDFSErasureCodingDesign-20150206.pdf, 
 fsimage-analysis-20150105.pdf


 Erasure Coding (EC) can greatly reduce the storage overhead without 
 sacrificing data reliability, compared to the existing HDFS 3-replica 
 approach. For example, if we use a 10+4 Reed-Solomon coding, we can tolerate 
 the loss of 4 blocks, with a storage overhead of only 40%. This makes EC a 
 quite attractive alternative for big data storage, particularly for cold 
 data. 
 Facebook had a related open source project called HDFS-RAID. It used to be 
 one of the contrib packages in HDFS but was removed in Hadoop 2.0 for 
 maintenance reasons. Its drawbacks are: 1) it sits on top of HDFS and depends 
 on MapReduce to do encoding and decoding tasks; 2) it can only be used for 
 cold files that are not intended to be appended anymore; 3) the pure Java EC 
 coding implementation is extremely slow in practical use. Due to these, it 
 might not be a good idea to just bring HDFS-RAID back.
 We (Intel and Cloudera) are working on a design to build EC into HDFS that 
 gets rid of any external dependencies, making it self-contained and 
 independently maintained. This design lays the EC feature on top of the 
 storage type support and is designed to be compatible with existing HDFS 
 features like caching, snapshots, encryption, and high availability. This 
 design will also support different EC coding schemes, implementations, and 
 policies for different deployment scenarios. By utilizing advanced libraries 
 (e.g. the Intel ISA-L library), an implementation can greatly improve the 
 performance of EC encoding/decoding and make the EC solution even more 
 attractive. We will post the design document soon. 
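
 For concreteness, the overhead comparison quoted above works out as follows 
 (overhead measured as redundant storage over original data):
 {code}
 \text{3-replication: } \frac{3x - x}{x} = 200\%, \qquad
 \text{RS}(10{+}4)\text{: } \frac{4}{10} = 40\%
 {code}
 with RS(10+4) still tolerating the loss of any 4 of the 14 blocks.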



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Description: 
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files (Optional)
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.

  was:
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.


 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files
 + Remove existing files
 + Replace existing files (Optional)
 This is required in cases where data is stored in AVRO format in HDFS and the 
 corresponding .avsc files are used to create Hive external tables.
 This will lead to the small files (.avsc files in this case) problem when 
 there are a large number of tables that need to be loaded into Hive as 
 external tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Description: 
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files [ -a filename-uri1 filename-uri2 ... / -a dirname-uri1 
dirname-uri2 ...]
+ Remove existing files [ -d filename-uri1 filename-uri2 ... / -d dirname-uri1 
dirname-uri2 ...]
+ Update/Replace existing files (Optional) [ -u old-filename-uri 
new-filename-uri]
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.

  was:
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files [ -a filename-uri1, filename-uri2, ...]
+ Remove existing files [ -d filename-uri1, filename-uri2, ...]
+ Update/Replace existing files (Optional) [ -u old-filename-uri 
new-filename-uri]
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.


 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files [ -a filename-uri1 filename-uri2 ... / -a dirname-uri1 
 dirname-uri2 ...]
 + Remove existing files [ -d filename-uri1 filename-uri2 ... / -d 
 dirname-uri1 dirname-uri2 ...]
 + Update/Replace existing files (Optional) [ -u old-filename-uri 
 new-filename-uri]
 This is required in cases where data is stored in AVRO format in HDFS and the 
 corresponding .avsc files are used to create Hive external tables.
 This will lead to the small files (.avsc files in this case) problem when 
 there are a large number of tables that need to be loaded into Hive as 
 external tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8027) Update CHANGES-HDFS-7285.txt with branch commits

2015-03-31 Thread Vinayakumar B (JIRA)
Vinayakumar B created HDFS-8027:
---

 Summary: Update CHANGES-HDFS-7285.txt with branch commits
 Key: HDFS-8027
 URL: https://issues.apache.org/jira/browse/HDFS-8027
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Vinayakumar B






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8027) Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-8027:

Summary: Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits  
(was: Update CHANGES-HDFS-7285.txt with branch commits)

 Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits
 

 Key: HDFS-8027
 URL: https://issues.apache.org/jira/browse/HDFS-8027
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Vinayakumar B





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8027) Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits

2015-03-31 Thread Vinayakumar B (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinayakumar B updated HDFS-8027:

Description: Latest branch commits are not tracked in CHANGES-HDFS-7285.txt.

 Erasure Coding: Update CHANGES-HDFS-7285.txt with branch commits
 

 Key: HDFS-8027
 URL: https://issues.apache.org/jira/browse/HDFS-8027
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Vinayakumar B
Assignee: Vinayakumar B
 Fix For: HDFS-7285

 Attachments: HDFS-8027-01.patch


 Latest branch commits are not tracked in CHANGES-HDFS-7285.txt.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8011) standby nn can't started

2015-03-31 Thread fujie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388330#comment-14388330
 ] 

fujie commented on HDFS-8011:
-

HDFS-6825's affected version is 2.5.0, but our Hadoop version is 2.3.0, so are you 
sure it is the same issue?

1. I am sure that the file was deleted. I also have some new findings.

Suppose we have image-file-1, editlog-file-1 and editlog-file-inprogress 
when starting the standby namenode A.
I observed the following behavior of these files:
step-1) SNN will load image-file-1 and editlog-file-1 and generate a new 
image file; call it image-file-2.
step-2) SNN will copy image-file-2 to the active namenode.
step-3) editlog-file-inprogress will be renamed to editlog-file-2 and a new 
editlog-file-inprogress will be opened.
step-4) SNN will load editlog-file-2; at the same time datanodes will report 
heartbeats to both active and standby. 

The crash happens at step-4. We printed all the failed files and all of them are 
in editlog-file-2.
We also gathered statistics: 20,000 operations failed out of 500,000. Then we 
parsed editlog-file-2 and found a common pattern in the failed records: in all 
of them, RPC_CLIENTID is null (blank) and RPC_CALLID is -2.
<RECORD>
  <OPCODE>OP_ADD_BLOCK</OPCODE>
  <DATA>
    <TXID>7660428426</TXID>
    <PATH>/workspace/dm/recommend/VideoQuality/VRII/AppList/data/interactivedata_month/_temporary/1/_temporary/attempt_1427018831005_178665_r_02_0/part-r-2</PATH>
    <BLOCK>
      <BLOCK_ID>2107099231</BLOCK_ID>
      <NUM_BYTES>0</NUM_BYTES>
      <GENSTAMP>1100893452304</GENSTAMP>
    </BLOCK>
    <RPC_CLIENTID></RPC_CLIENTID>
    <RPC_CALLID>-2</RPC_CALLID>
  </DATA>
</RECORD>

2. If we restart SNN A again, editlog-file-2 can be loaded correctly, just 
like editlog-file-1 in the previous restart. It's weird.
Does the reported heartbeat impact its behavior? The load process and 
report process should be asynchronous, shouldn't they?

We are looking forward to your reply.

 standby nn can't started
 

 Key: HDFS-8011
 URL: https://issues.apache.org/jira/browse/HDFS-8011
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.3.0
 Environment: CentOS 6.2 64-bit 
Reporter: fujie

 We have seen a crash when starting the standby namenode, with fatal errors. Any 
 solutions, workarounds, or ideas would be helpful for us.
 1. Here is the context: 
   At the beginning we had 2 namenodes, A as active and B as standby. For 
 some reason, namenode A died, so namenode B took over as active.
   When we tried to restart A after a minute, it couldn't start. During this 
 time a lot of files were put to HDFS, and a lot of files were renamed. 
   Namenode A crashed while awaiting reported blocks in safemode each 
 time.
  
 2. We can see error log below:
   1)2015-03-30  ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
 on operation CloseOp [length=0, inodeId=0, 
 path=/xxx/_temporary/xxx/part-r-00074.bz2, replication=3, 
 mtime=1427699913947, atime=1427699081161, blockSize=268435456, 
 blocks=[blk_2103131025_1100889495739], permissions=dm:dm:rw-r--r--, 
 clientName=, clientMachine=, opCode=OP_CLOSE, txid=7632753612]
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.setGenerationStampAndVerifyReplicas(BlockInfoUnderConstruction.java:247)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.commitBlock(BlockInfoUnderConstruction.java:267)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:639)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:813)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:383)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:209)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:122)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:737)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$0(EditLogTailer.java:302)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:356)
 at 
 

[jira] [Commented] (HDFS-7701) Support reporting per storage type quota and usage with hadoop/hdfs shell

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388349#comment-14388349
 ] 

Hadoop QA commented on HDFS-7701:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708388/HDFS-7701.04.patch
  against trunk revision b5a22e9.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10128//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10128//console

This message is automatically generated.

 Support reporting per storage type quota and usage with hadoop/hdfs shell
 -

 Key: HDFS-7701
 URL: https://issues.apache.org/jira/browse/HDFS-7701
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode, namenode
Reporter: Xiaoyu Yao
Assignee: Peter Shi
 Attachments: HDFS-7701.01.patch, HDFS-7701.02.patch, 
 HDFS-7701.03.patch, HDFS-7701.04.patch


 hadoop fs -count -q or hdfs dfs -count -q currently shows name space/disk 
 space quota and remaining quota information. With HDFS-7584, we want to 
 display per storage type quota and its remaining information as well.
 The current output format as shown below may not easily accommodate 6 more 
 columns = 3 (existing storage types) * 2 (quota/remaining quota). With new 
 storage types added in the future, this will make the output even more crowded. 
 There are also compatibility issues as we don't want to break any existing 
 scripts monitoring hadoop fs -count -q output. 
 $ hadoop fs -count -q -v /test
        QUOTA  REM_QUOTA  SPACE_QUOTA  REM_SPACE_QUOTA  DIR_COUNT  FILE_COUNT  CONTENT_SIZE  PATHNAME
         none        inf    524288000        524266569          1          15         21431  /test
 We propose to add a -t parameter to display ONLY the storage type quota 
 information of the directory separately. This way, existing scripts 
 will work as-is without using the -t parameter. 
 1) When -t is not followed by a specific storage type, quota and usage 
 information for all storage types will be displayed. 
 $ hadoop fs -count -q -t -h -v /test
  SSD_QUOTA  REM_SSD_QUOTA  DISK_QUOTA  REM_DISK_QUOTA  ARCHIVAL_QUOTA  REM_ARCHIVAL_QUOTA  PATHNAME
      512MB          256MB        none             inf            none                 inf  /test
 2) If -t is followed by a storage type, only the quota and remaining quota of 
 the storage type is displayed. 
 $ hadoop fs -count -q -t SSD -h -v /test
  SSD_QUOTA  REM_SSD_QUOTA  PATHNAME
     512 MB         256 MB  /test



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8012) Updatable HAR Filesystem

2015-03-31 Thread Madhan Sundararajan Devaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Madhan Sundararajan Devaki updated HDFS-8012:
-
Description: 
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files [ -a filename-uri1, filename-uri2, ...]
+ Remove existing files [ -d filename-uri1, filename-uri2, ...]
+ Update/Replace existing files (Optional) [ -u old-filename-uri 
new-filename-uri]
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.

  was:
Is there a plan to support updatable HAR Filesystem? If so, by when is this 
expected please?
The following operations may be supported.
+ Add new files
+ Remove existing files
+ Replace existing files (Optional)
This is required in cases where data is stored in AVRO format in HDFS and the 
corresponding .avsc files are used to create Hive external tables.
This will lead to the small files (.avsc files in this case) problem when there 
are a large number of tables that need to be loaded into Hive as external 
tables.


 Updatable HAR Filesystem
 

 Key: HDFS-8012
 URL: https://issues.apache.org/jira/browse/HDFS-8012
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, hdfs-client
Reporter: Madhan Sundararajan Devaki
Priority: Critical

 Is there a plan to support updatable HAR Filesystem? If so, by when is this 
 expected please?
 The following operations may be supported.
 + Add new files [ -a filename-uri1, filename-uri2, ...]
 + Remove existing files [ -d filename-uri1, filename-uri2, ...]
 + Update/Replace existing files (Optional) [ -u old-filename-uri 
 new-filename-uri]
 This is required in cases where data is stored in AVRO format in HDFS and the 
 corresponding .avsc files are used to create Hive external tables.
 This will lead to the small files (.avsc files in this case) problem when 
 there are a large number of tables that need to be loaded into Hive as 
 external tables.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7888) Change DataStreamer/DFSOutputStream/DFSPacket for convenience of subclassing

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388304#comment-14388304
 ] 

Hadoop QA commented on HDFS-7888:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12708333/HDFS-7888-trunk-001.patch
  against trunk revision 85dc3c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10124//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10124//console

This message is automatically generated.

 Change DataStreamer/DFSOutputStream/DFSPacket for convenience of subclassing
 

 Key: HDFS-7888
 URL: https://issues.apache.org/jira/browse/HDFS-7888
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7888-001.patch, HDFS-7888-trunk-001.patch


 HDFS-7793 refactors class {{DFSOutputStream}} on trunk which makes 
 {{DFSOutputStream}} a class without any inner classes. We want to subclass 
 {{DFSOutputStream}} to support striping layout writing. This JIRA depends 
 upon HDFS-7793 and tries to change DataStreamer/DFSOutputStream/DFSPacket for 
 convenience of subclassing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7954) TestBalancer#testBalancerWithPinnedBlocks failed on Windows

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388335#comment-14388335
 ] 

Hadoop QA commented on HDFS-7954:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708345/HDFS-7947.00.patch
  against trunk revision 85dc3c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10125//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10125//console

This message is automatically generated.

 TestBalancer#testBalancerWithPinnedBlocks failed on Windows
 ---

 Key: HDFS-7954
 URL: https://issues.apache.org/jira/browse/HDFS-7954
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-7947.00.patch


 {code}
 testBalancerWithPinnedBlocks(org.apache.hadoop.hdfs.server.balancer.TestBalancer)
   Time elapsed: 22.624 sec   FAILURE!
 java.lang.AssertionError: expected:<-3> but was:<0>
   at org.junit.Assert.fail(Assert.java:88)
   at org.junit.Assert.failNotEquals(Assert.java:743)
   at org.junit.Assert.assertEquals(Assert.java:118)
   at org.junit.Assert.assertEquals(Assert.java:555)
   at org.junit.Assert.assertEquals(Assert.java:542)
   at 
 org.apache.hadoop.hdfs.server.balancer.TestBalancer.testBalancerWithPinnedBlocks(TestBalancer.java:353)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388482#comment-14388482
 ] 

Rakesh R commented on HDFS-7937:


Thanks [~kaisasak], a good number of unit test cases. I have a few minor comments; 
please take a look.
# In TestINodeFile#testBlockStripedTotalBlockCount, do we need the below logic in 
this testcase?
{code}
+INodeFile inf = createINodeFile(HdfsConstants.EC_STORAGE_POLICY_ID);
+inf.addStripedBlocksFeature();
{code}
# Could you please reverse the {{actual}} and {{expected}} arguments in the 
assertions? I see this kind of usage in many places; please modify all such 
cases. For example,
case-1) 
{code}
assertEquals(inf.getBlocks().length, 1);

can be written as :

assertEquals(1, inf.getBlocks().length);
{code}
Case-2) 
{code}
assertEquals(blockInfoStriped.getTotalBlockNum(), 9);

can be written as :

assertEquals(9, blockInfoStriped.getTotalBlockNum());
{code}

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated HDFS-7937:
---
Status: Open  (was: Patch Available)

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8002) Website refers to /trash directory

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388414#comment-14388414
 ] 

Hudson commented on HDFS-8002:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-8002. Website refers to /trash directory. Contributed by Brahma Reddy 
Battula. (aajisaka: rev e7ea2a8e8f0a7b428ef10552885757b99b59e4dc)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Website refers to /trash directory
 --

 Key: HDFS-8002
 URL: https://issues.apache.org/jira/browse/HDFS-8002
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Mike Drob
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0

 Attachments: HDFS-8002.patch, HDFS-8003-002.patch


 On 
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#File_Deletes_and_Undeletes
  the section on trash refers to files residing in {{/trash}}.
 I think this is an error, as files actually go to user-specific trash 
 directories like {{/user/hdfs/.Trash}}.
 Either the site needs to be updated to mention user specific directories, or 
 if this is a change from previous behaviour then maybe that can be mentioned 
 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3918) EditLogTailer shouldn't log WARN when other node is in standby mode

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388415#comment-14388415
 ] 

Hudson commented on HDFS-3918:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-3918. EditLogTailer shouldn't log WARN when other node is in standby mode. 
Contributed by Todd Lipcon. (harsh: rev 
cce66ba3c9ec293e8ba1afd0eb518c7ca0bbc7c9)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java


 EditLogTailer shouldn't log WARN when other node is in standby mode
 ---

 Key: HDFS-3918
 URL: https://issues.apache.org/jira/browse/HDFS-3918
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.0.3-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 2.8.0

 Attachments: hdfs-3918.txt


 If both nodes are in standby mode, each will be trying to roll the others' 
 logs, which results in errors like:
 Unable to trigger a roll of the active NN 
 org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not 
 supported in state standby
 We should catch this specific exception and not log it at WARN level, since 
 it's expected behavior.
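 For illustration, a minimal sketch of the proposed handling, assuming a 
 hypothetical call site (the proxy variable and the quieter log message are 
 assumptions, not the committed change):
 {code}
 try {
   activeNNProxy.rollEditLog();
 } catch (StandbyException se) {
   // Expected when the other NN is also in standby; don't log at WARN.
   LOG.info("Skipping edit log roll; remote NN is in standby state");
 }
 {code}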



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388413#comment-14388413
 ] 

Hudson commented on HDFS-7261:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-7261. storageMap is accessed without synchronization in 
DatanodeDescriptor#updateHeartbeatState() (Brahma Reddy Battula via Colin P. 
McCabe) (cmccabe: rev 1feb9569f366a29ecb43592d71ee21023162c18f)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 storageMap is accessed without synchronization in 
 DatanodeDescriptor#updateHeartbeatState()
 ---

 Key: HDFS-7261
 URL: https://issues.apache.org/jira/browse/HDFS-7261
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0

 Attachments: HDFS-7261-001.patch, HDFS-7261-002.patch, HDFS-7261.patch


 Here is the code:
 {code}
   failedStorageInfos = new HashSet<DatanodeStorageInfo>(
       storageMap.values());
 {code}
 In other places, the lock on DatanodeDescriptor.storageMap is held:
 {code}
 synchronized (storageMap) {
   final Collection<DatanodeStorageInfo> storages = storageMap.values();
   return storages.toArray(new DatanodeStorageInfo[storages.size()]);
 }
 {code}
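 A minimal sketch of the fix this implies, copying the values under the same 
 storageMap lock (the surrounding method context in DatanodeDescriptor is 
 assumed):
 {code}
 Set<DatanodeStorageInfo> failedStorageInfos;
 synchronized (storageMap) {
   // Copy under the lock so concurrent storage updates cannot race the copy.
   failedStorageInfos = new HashSet<DatanodeStorageInfo>(storageMap.values());
 }
 {code}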



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388422#comment-14388422
 ] 

Hudson commented on HDFS-7944:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-7944. Minor cleanup of BlockPoolManager#getAllNamenodeThreads. (Arpit 
Agarwal) (arp: rev 85dc3c14b2ca4b01a93361bb925c39a22a6fd8db)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDatanodeProtocolRetryPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeExit.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBlockReports.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestRefreshNamenodes.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMultipleRegistrations.java


 Minor cleanup of BlockPoolManager#getAllNamenodeThreads
 ---

 Key: HDFS-7944
 URL: https://issues.apache.org/jira/browse/HDFS-7944
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Fix For: 2.8.0

 Attachments: HDFS-7944.01.patch, HDFS-7944.02.patch


 {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to 
 array conversion and vice versa by returning an unmodifiable list.
 Since NN addition/removal is relatively rare we can just use a 
 {{CopyOnWriteArrayList}} for concurrency.
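 A hypothetical sketch of that cleanup (the class shape and names here are 
 assumptions, not the committed code):
 {code}
 import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.CopyOnWriteArrayList;

 class BlockPoolManagerSketch {
   static class BPOfferService { }  // stand-in for the real class

   // NN add/remove is rare, so copy-on-write keeps iteration lock-free
   // while writers pay the (rare) copy cost.
   private final List<BPOfferService> offerServices =
       new CopyOnWriteArrayList<BPOfferService>();

   List<BPOfferService> getAllNamenodeThreads() {
     // An unmodifiable view avoids the list-to-array round trips.
     return Collections.unmodifiableList(offerServices);
   }
 }
 {code}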



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8028) TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed after patched HDFS-7704

2015-03-31 Thread hongyu bi (JIRA)
hongyu bi created HDFS-8028:
---

 Summary: 
TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed 
after patched HDFS-7704
 Key: HDFS-8028
 URL: https://issues.apache.org/jira/browse/HDFS-8028
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: hongyu bi
Assignee: hongyu bi
Priority: Minor


HDFS-7704 makes bad block reporting asynchronous; however, 
BlockReportTestBase#blockreport_02 doesn't wait after triggering the block report.
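A minimal sketch of the kind of wait the test may need after triggering the 
report; the polled helper getCorruptReplicaCount() and the timeout values are 
assumptions:
{code}
// Poll until the asynchronously reported bad block shows up, or time out.
long deadline = System.currentTimeMillis() + 10000;
while (getCorruptReplicaCount() < expectedCorrupt
    && System.currentTimeMillis() < deadline) {
  Thread.sleep(100);
}
{code}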



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8028) TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed after patched HDFS-7704

2015-03-31 Thread hongyu bi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hongyu bi updated HDFS-8028:

Attachment: HDFS-8028-v0.patch

 TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed 
 after patched HDFS-7704
 --

 Key: HDFS-8028
 URL: https://issues.apache.org/jira/browse/HDFS-8028
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: hongyu bi
Assignee: hongyu bi
Priority: Minor
 Attachments: HDFS-8028-v0.patch


 HDFS-7704 makes bad block reporting asynchronous; however, 
 BlockReportTestBase#blockreport_02 doesn't wait after triggering the block report.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388416#comment-14388416
 ] 

Hudson commented on HDFS-7645:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-7645. Rolling upgrade is restoring blocks from trash multiple times 
(Contributed by Vinayakumar B and Keisuke Ogiwara) (arp: rev 
1a495fbb489c9e9a23b341a52696d10e9e272b04)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/RollingUpgradeStatus.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/RollingUpgradeInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java


 Rolling upgrade is restoring blocks from trash multiple times
 -

 Key: HDFS-7645
 URL: https://issues.apache.org/jira/browse/HDFS-7645
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Assignee: Keisuke Ogiwara
 Fix For: 2.8.0

 Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, 
 HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch, 
 HDFS-7645.06.patch, HDFS-7645.07.patch


 When performing an HDFS rolling upgrade, the trash directory is getting 
 restored twice when under normal circumstances it shouldn't need to be 
 restored at all. IIUC, the only time these blocks should be restored is if we 
 need to roll back a rolling upgrade. 
 On a busy cluster, this can cause significant and unnecessary block churn 
 both on the datanodes, and more importantly in the namenode.
 The two times this happens are:
 1) restart of DN onto new software
 {code}
 private void doTransition(DataNode datanode, StorageDirectory sd,
     NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
   if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
     Preconditions.checkState(!getTrashRootDir(sd).exists(),
         sd.getPreviousDir() + " and " + getTrashRootDir(sd) +
         " should not both be present.");
     doRollback(sd, nsInfo); // rollback if applicable
   } else {
     // Restore all the files in the trash. The restored files are retained
     // during rolling upgrade rollback. They are deleted during rolling
     // upgrade downgrade.
     int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
     LOG.info("Restored " + restored + " block files from trash.");
   }
 {code}
 2) When the heartbeat response no longer indicates a rolling upgrade is in progress
 {code}
 /**
  * Signal the current rolling upgrade status as indicated by the NN.
  * @param inProgress true if a rolling upgrade is in progress
  */
 void signalRollingUpgrade(boolean inProgress) throws IOException {
   String bpid = getBlockPoolId();
   if (inProgress) {
     dn.getFSDataset().enableTrash(bpid);
     dn.getFSDataset().setRollingUpgradeMarker(bpid);
   } else {
     dn.getFSDataset().restoreTrash(bpid);
     dn.getFSDataset().clearRollingUpgradeMarker(bpid);
   }
 }
 {code}
 HDFS-6800 and HDFS-6981 modified this behavior, making it not completely 
 clear whether this is somehow intentional. 
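 One hypothetical way to avoid the repeated restore is to act only on actual 
 state transitions; this sketch assumes a cached lastRollingUpgradeSignal field 
 and is not the actual patch:
 {code}
 void signalRollingUpgrade(boolean inProgress) throws IOException {
   String bpid = getBlockPoolId();
   if (inProgress == lastRollingUpgradeSignal) {
     return;  // no state change: don't re-restore trash on every heartbeat
   }
   lastRollingUpgradeSignal = inProgress;
   if (inProgress) {
     dn.getFSDataset().enableTrash(bpid);
   } else {
     dn.getFSDataset().restoreTrash(bpid);
   }
 }
 {code}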



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7742) favoring decommissioning node for replication can cause a block to stay underreplicated for long periods

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388412#comment-14388412
 ] 

Hudson commented on HDFS-7742:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-7742. Favoring decommissioning node for replication can cause a block to 
stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java


 favoring decommissioning node for replication can cause a block to stay 
 underreplicated for long periods
 

 Key: HDFS-7742
 URL: https://issues.apache.org/jira/browse/HDFS-7742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts
 Fix For: 2.7.0

 Attachments: HDFS-7742-v0.patch


 When choosing a source node to replicate a block from, a decommissioning node 
 is favored. The reason for the favoritism is that decommissioning nodes 
 aren't servicing any writes, so in theory they are less loaded.
 However, the same selection algorithm also tries to make sure it doesn't get 
 stuck on any particular node:
 {noformat}
   // switch to a different node randomly
   // this to prevent from deterministically selecting the same node even
   // if the node failed to replicate the block on previous iterations
 {noformat}
 Unfortunately, the decommissioning check is prior to this randomness so the 
 algorithm can get stuck trying to replicate from a decommissioning node. 
 We've seen this in practice where a decommissioning datanode was failing to 
 replicate a block for many days, when other viable replicas of the block were 
 available.
 Given that we limit the number of streams we'll assign to a given node 
 (default soft limit of 2, hard limit of 4), it doesn't seem like favoring a 
 decommissioning node has a significant benefit; i.e. when there is significant 
 replication work to do, we'll quickly hit the stream limit of the 
 decommissioning nodes and use other nodes in the cluster anyway; when there 
 isn't significant replication work, then in theory we've got plenty of 
 replication bandwidth available, so choosing a decommissioning node isn't much 
 of a win.
 I see two choices:
 1) Change the algorithm to still favor decommissioning nodes but with some 
 level of randomness that will avoid always selecting the decommissioning node
 2) Remove the favoritism for decommissioning nodes
 I prefer #2. It simplifies the algorithm, and given the other throttles we 
 have in place, I'm not sure there is a significant benefit to selecting 
 decommissioning nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7748) Separate ECN flags from the Status in the DataTransferPipelineAck

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388418#comment-14388418
 ] 

Hudson commented on HDFS-7748:
--

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #149 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/149/])
HDFS-7748. Separate ECN flags from the Status in the DataTransferPipelineAck. 
Contributed by Anu Engineer and Haohui Mai. (wheat9: rev 
b80457158daf0dc712fbe5695625cc17d70d4bb4)
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
Addendum for HDFS-7748. (wheat9: rev 0967b1d99d7001cd1d09ebd29b9360f1079410e8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java


 Separate ECN flags from the Status in the DataTransferPipelineAck
 -

 Key: HDFS-7748
 URL: https://issues.apache.org/jira/browse/HDFS-7748
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Anu Engineer
Priority: Blocker
 Attachments: HDFS-7748.007-addendum.patch, HDFS-7748.007.patch, 
 hdfs-7748.001.patch, hdfs-7748.002.patch, hdfs-7748.003.patch, 
 hdfs-7748.004.patch, hdfs-7748.005.patch, hdfs-7748.006.patch, 
 hdfs-7748.branch-2.7.006.patch


 Prior to the discussions on HDFS-7270, the old clients might fail to talk to 
 the newer server when ECN is turned on. This jira proposes to separate the 
 ECN flags in a separate protobuf field to make the ack compatible on both 
 versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8028) TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed after patched HDFS-7704

2015-03-31 Thread hongyu bi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hongyu bi updated HDFS-8028:

Attachment: (was: HDFS-8028-v0.patch)

 TestNNHandlesBlockReportPerStorage/TestNNHandlesCombinedBlockReport Failed 
 after patched HDFS-7704
 --

 Key: HDFS-8028
 URL: https://issues.apache.org/jira/browse/HDFS-8028
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 2.7.0
Reporter: hongyu bi
Assignee: hongyu bi
Priority: Minor
 Attachments: HDFS-8028-v0.patch


 HDFS-7704 makes bad block reporting asynchronous; however, 
 BlockReportTestBase#blockreport_02 doesn't wait after triggering the block report.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7261) storageMap is accessed without synchronization in DatanodeDescriptor#updateHeartbeatState()

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388433#comment-14388433
 ] 

Hudson commented on HDFS-7261:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-7261. storageMap is accessed without synchronization in 
DatanodeDescriptor#updateHeartbeatState() (Brahma Reddy Battula via Colin P. 
McCabe) (cmccabe: rev 1feb9569f366a29ecb43592d71ee21023162c18f)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java


 storageMap is accessed without synchronization in 
 DatanodeDescriptor#updateHeartbeatState()
 ---

 Key: HDFS-7261
 URL: https://issues.apache.org/jira/browse/HDFS-7261
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0

 Attachments: HDFS-7261-001.patch, HDFS-7261-002.patch, HDFS-7261.patch


 Here is the code:
 {code}
   failedStorageInfos = new HashSet<DatanodeStorageInfo>(
       storageMap.values());
 {code}
 In other places, the lock on DatanodeDescriptor.storageMap is held:
 {code}
 synchronized (storageMap) {
   final Collection<DatanodeStorageInfo> storages = storageMap.values();
   return storages.toArray(new DatanodeStorageInfo[storages.size()]);
 }
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7742) favoring decommissioning node for replication can cause a block to stay underreplicated for long periods

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388432#comment-14388432
 ] 

Hudson commented on HDFS-7742:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-7742. Favoring decommissioning node for replication can cause a block to 
stay (kihwal: rev 04ee18ed48ceef34598f954ff40940abc9fde1d2)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 favoring decommissioning node for replication can cause a block to stay 
 underreplicated for long periods
 

 Key: HDFS-7742
 URL: https://issues.apache.org/jira/browse/HDFS-7742
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Assignee: Nathan Roberts
 Fix For: 2.7.0

 Attachments: HDFS-7742-v0.patch


 When choosing a source node to replicate a block from, a decommissioning node 
 is favored. The reason for the favoritism is that decommissioning nodes 
 aren't servicing any writes, so in theory they are less loaded.
 However, the same selection algorithm also tries to make sure it doesn't get 
 stuck on any particular node:
 {noformat}
   // switch to a different node randomly
   // this to prevent from deterministically selecting the same node even
   // if the node failed to replicate the block on previous iterations
 {noformat}
 Unfortunately, the decommissioning check is prior to this randomness so the 
 algorithm can get stuck trying to replicate from a decommissioning node. 
 We've seen this in practice where a decommissioning datanode was failing to 
 replicate a block for many days, when other viable replicas of the block were 
 available.
 Given that we limit the number of streams we'll assign to a given node 
 (default soft limit of 2, hard limit of 4), it doesn't seem like favoring a 
 decommissioning node has a significant benefit; i.e. when there is significant 
 replication work to do, we'll quickly hit the stream limit of the 
 decommissioning nodes and use other nodes in the cluster anyway; when there 
 isn't significant replication work, then in theory we've got plenty of 
 replication bandwidth available, so choosing a decommissioning node isn't much 
 of a win.
 I see two choices:
 1) Change the algorithm to still favor decommissioning nodes but with some 
 level of randomness that will avoid always selecting the decommissioning node
 2) Remove the favoritism for decommissioning nodes
 I prefer #2. It simplifies the algorithm, and given the other throttles we 
 have in place, I'm not sure there is a significant benefit to selecting 
 decommissioning nodes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-3918) EditLogTailer shouldn't log WARN when other node is in standby mode

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388435#comment-14388435
 ] 

Hudson commented on HDFS-3918:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-3918. EditLogTailer shouldn't log WARN when other node is in standby mode. 
Contributed by Todd Lipcon. (harsh: rev 
cce66ba3c9ec293e8ba1afd0eb518c7ca0bbc7c9)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 EditLogTailer shouldn't log WARN when other node is in standby mode
 ---

 Key: HDFS-3918
 URL: https://issues.apache.org/jira/browse/HDFS-3918
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 2.0.3-alpha
Reporter: Todd Lipcon
Assignee: Todd Lipcon
 Fix For: 2.8.0

 Attachments: hdfs-3918.txt


 If both nodes are in standby mode, each will be trying to roll the others' 
 logs, which results in errors like:
 Unable to trigger a roll of the active NN 
 org.apache.hadoop.ipc.StandbyException: Operation category JOURNAL is not 
 supported in state standby
 We should catch this specific exception and not log it at WARN level, since 
 it's expected behavior.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8002) Website refers to /trash directory

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388434#comment-14388434
 ] 

Hudson commented on HDFS-8002:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-8002. Website refers to /trash directory. Contributed by Brahma Reddy 
Battula. (aajisaka: rev e7ea2a8e8f0a7b428ef10552885757b99b59e4dc)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Website refers to /trash directory
 --

 Key: HDFS-8002
 URL: https://issues.apache.org/jira/browse/HDFS-8002
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Mike Drob
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0

 Attachments: HDFS-8002.patch, HDFS-8003-002.patch


 On 
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#File_Deletes_and_Undeletes
  the section on trash refers to files residing in {{/trash}}.
 I think this is an error, as files actually go to user-specific trash 
 directories like {{/user/hdfs/.Trash}}.
 Either the site needs to be updated to mention user specific directories, or 
 if this is a change from previous behaviour then maybe that can be mentioned 
 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7645) Rolling upgrade is restoring blocks from trash multiple times

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388436#comment-14388436
 ] 

Hudson commented on HDFS-7645:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-7645. Rolling upgrade is restoring blocks from trash multiple times 
(Contributed by Vinayakumar B and Keisuke Ogiwara) (arp: rev 
1a495fbb489c9e9a23b341a52696d10e9e272b04)
* hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/impl/FsDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/RollingUpgradeInfo.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/RollingUpgradeStatus.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/extdataset/ExternalDatasetImpl.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeRollingUpgrade.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocolPB/PBHelper.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/hdfs.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/SimulatedFSDataset.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/fsdataset/FsDatasetSpi.java


 Rolling upgrade is restoring blocks from trash multiple times
 -

 Key: HDFS-7645
 URL: https://issues.apache.org/jira/browse/HDFS-7645
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.6.0
Reporter: Nathan Roberts
Assignee: Keisuke Ogiwara
 Fix For: 2.8.0

 Attachments: HDFS-7645.01.patch, HDFS-7645.02.patch, 
 HDFS-7645.03.patch, HDFS-7645.04.patch, HDFS-7645.05.patch, 
 HDFS-7645.06.patch, HDFS-7645.07.patch


 When performing an HDFS rolling upgrade, the trash directory is getting 
 restored twice when under normal circumstances it shouldn't need to be 
 restored at all. As I understand it, the only time these blocks should be 
 restored is if we need to roll back a rolling upgrade. 
 On a busy cluster, this can cause significant and unnecessary block churn 
 both on the datanodes, and more importantly in the namenode.
 The two times this happens are:
 1) restart of DN onto new software
 {code}
   private void doTransition(DataNode datanode, StorageDirectory sd,
       NamespaceInfo nsInfo, StartupOption startOpt) throws IOException {
     if (startOpt == StartupOption.ROLLBACK && sd.getPreviousDir().exists()) {
       Preconditions.checkState(!getTrashRootDir(sd).exists(),
           sd.getPreviousDir() + " and " + getTrashRootDir(sd) +
           " should not both be present.");
       doRollback(sd, nsInfo); // rollback if applicable
     } else {
       // Restore all the files in the trash. The restored files are retained
       // during rolling upgrade rollback. They are deleted during rolling
       // upgrade downgrade.
       int restored = restoreBlockFilesFromTrash(getTrashRootDir(sd));
       LOG.info("Restored " + restored + " block files from trash.");
     }
   }
 {code}
 2) When the heartbeat response no longer indicates a rolling upgrade is in progress
 {code}
   /**
    * Signal the current rolling upgrade status as indicated by the NN.
    * @param inProgress true if a rolling upgrade is in progress
    */
   void signalRollingUpgrade(boolean inProgress) throws IOException {
     String bpid = getBlockPoolId();
     if (inProgress) {
       dn.getFSDataset().enableTrash(bpid);
       dn.getFSDataset().setRollingUpgradeMarker(bpid);
     } else {
       dn.getFSDataset().restoreTrash(bpid);
       dn.getFSDataset().clearRollingUpgradeMarker(bpid);
     }
   }
 {code}
 HDFS-6800 and HDFS-6981 modified this behavior, so it is not completely 
 clear whether the double restore is somehow intentional. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388442#comment-14388442
 ] 

Hudson commented on HDFS-7944:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-7944. Minor cleanup of BlockPoolManager#getAllNamenodeThreads. (Arpit 
Agarwal) (arp: rev 85dc3c14b2ca4b01a93361bb925c39a22a6fd8db)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMultipleRegistrations.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestRefreshNamenodes.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeExit.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBlockReports.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDatanodeProtocolRetryPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolManager.java


 Minor cleanup of BlockPoolManager#getAllNamenodeThreads
 ---

 Key: HDFS-7944
 URL: https://issues.apache.org/jira/browse/HDFS-7944
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Fix For: 2.8.0

 Attachments: HDFS-7944.01.patch, HDFS-7944.02.patch


 {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list to 
 array conversion and vice versa by returning an unmodifiable list.
 Since NN addition/removal is relatively rare, we can just use a 
 {{CopyOnWriteArrayList}} for concurrency, as sketched below.
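 
 A minimal sketch of the idea, with illustrative type and field names rather 
 than the exact HDFS-7944 diff:
 {code}
 import java.util.Collections;
 import java.util.List;
 import java.util.concurrent.CopyOnWriteArrayList;

 class BlockPoolManagerSketch {
   // Stand-in for the real BPOfferService type.
   static class BPOfferService { }

   // NN addition/removal is rare, so copy-on-write keeps readers lock-free
   // while writers pay the (acceptable) cost of copying the backing array.
   private final CopyOnWriteArrayList<BPOfferService> bpoServices =
       new CopyOnWriteArrayList<>();

   List<BPOfferService> getAllNamenodeThreads() {
     // Unmodifiable view: callers no longer need list-to-array round trips.
     return Collections.unmodifiableList(bpoServices);
   }
 }
 {code}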



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7748) Separate ECN flags from the Status in the DataTransferPipelineAck

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388438#comment-14388438
 ] 

Hudson commented on HDFS-7748:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #883 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/883/])
HDFS-7748. Separate ECN flags from the Status in the DataTransferPipelineAck. 
Contributed by Anu Engineer and Haohui Mai. (wheat9: rev 
b80457158daf0dc712fbe5695625cc17d70d4bb4)
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
Addendum for HDFS-7748. (wheat9: rev 0967b1d99d7001cd1d09ebd29b9360f1079410e8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java


 Separate ECN flags from the Status in the DataTransferPipelineAck
 -

 Key: HDFS-7748
 URL: https://issues.apache.org/jira/browse/HDFS-7748
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Anu Engineer
Priority: Blocker
 Attachments: HDFS-7748.007-addendum.patch, HDFS-7748.007.patch, 
 hdfs-7748.001.patch, hdfs-7748.002.patch, hdfs-7748.003.patch, 
 hdfs-7748.004.patch, hdfs-7748.005.patch, hdfs-7748.006.patch, 
 hdfs-7748.branch-2.7.006.patch


 Prior to the discussions on HDFS-7270, the old clients might fail to talk to 
 the newer server when ECN is turned on. This jira proposes to carry the 
 ECN flags in a separate protobuf field to make the ack compatible on both 
 versions.
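 
 The compatibility idea, reduced to a hedged toy model (the field layout and 
 names below are illustrative, not the actual PipelineAck or protobuf 
 definitions): keep the reply status and the ECN signal in distinct fields, so 
 a pre-ECN client that only decodes the status field parses the ack unchanged, 
 whereas packing ECN bits into the status values would produce numeric values 
 an old decoder rejects.
 {code}
 // Illustrative sketch of "separate fields" compatibility, not HDFS source.
 class AckSketch {
   enum Status { SUCCESS, ERROR }
   enum ECN { DISABLED, SUPPORTED, CONGESTED }

   final Status status; // the field old clients already understand
   final ECN ecn;       // new, optional field; old clients simply ignore it

   AckSketch(Status status, ECN ecn) {
     this.status = status;
     this.ecn = ecn;
   }
 }
 {code}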



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7939) Two fsimage_rollback_* files are created which are not deleted after rollback.

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388445#comment-14388445
 ] 

Hadoop QA commented on HDFS-7939:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708145/HDFS-7939.1.patch
  against trunk revision 85dc3c1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10127//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10127//console

This message is automatically generated.

 Two fsimage_rollback_* files are created which are not deleted after rollback.
 --

 Key: HDFS-7939
 URL: https://issues.apache.org/jira/browse/HDFS-7939
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Priority: Critical
 Attachments: HDFS-7939.1.patch


 During a checkpoint, if the upload to the remote Namenode fails, then 
 restarting the Namenode with the rollingUpgrade started option creates two 
 fsimage_rollback_* files at the Active Namenode.
 On rolling upgrade rollback, the initially created fsimage_rollback_* file 
 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7939) Two fsimage_rollback_* files are created which are not deleted after rollback.

2015-03-31 Thread J.Andreina (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388459#comment-14388459
 ] 

J.Andreina commented on HDFS-7939:
--

Testcase failures are not related to this patch.
Please review the patch.

 Two fsimage_rollback_* files are created which are not deleted after rollback.
 --

 Key: HDFS-7939
 URL: https://issues.apache.org/jira/browse/HDFS-7939
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Priority: Critical
 Attachments: HDFS-7939.1.patch


 During a checkpoint, if the upload to the remote Namenode fails, then 
 restarting the Namenode with the rollingUpgrade started option creates two 
 fsimage_rollback_* files at the Active Namenode.
 On rolling upgrade rollback, the initially created fsimage_rollback_* file 
 is not deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8009) Signal congestion on the DataNode

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389847#comment-14389847
 ] 

Hadoop QA commented on HDFS-8009:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708538/HDFS-8009.000.patch
  against trunk revision e428fea.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The test build failed in 
hadoop-hdfs-project/hadoop-hdfs 

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10133//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10133//console

This message is automatically generated.

 Signal congestion on the DataNode
 -

 Key: HDFS-8009
 URL: https://issues.apache.org/jira/browse/HDFS-8009
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8009.000.patch


 The DataNode should signal congestion (i.e. "I'm too busy") in the 
 PipelineAck using the mechanism introduced in HDFS-7270.
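 
 A sketch of how a DataNode might decide to raise that flag; the load-average 
 heuristic and the 1.5x threshold below are assumptions for illustration, not 
 the committed patch.
 {code}
 import java.lang.management.ManagementFactory;

 class CongestionSketch {
   enum ECN { SUPPORTED, CONGESTED }

   // Flag congestion when the recent load average clearly exceeds the
   // core count; 1.5x is an arbitrary illustrative threshold.
   ECN currentEcn() {
     double load =
         ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
     int cores = Runtime.getRuntime().availableProcessors();
     // getSystemLoadAverage() may return a negative value on some platforms;
     // a negative load therefore reads as "not congested" here.
     return (load > 1.5 * cores) ? ECN.CONGESTED : ECN.SUPPORTED;
   }
 }
 {code}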



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-8020) Erasure Coding: restore BlockGroup and schema info from stripping coding command

2015-03-31 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki reassigned HDFS-8020:


Assignee: Kai Sasaki  (was: Kai Zheng)

 Erasure Coding: restore BlockGroup and schema info from stripping coding 
 command
 

 Key: HDFS-8020
 URL: https://issues.apache.org/jira/browse/HDFS-8020
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Zheng
Assignee: Kai Sasaki

 As a task of HDFS-7344, to process *striping* coding commands from NameNode 
 or other scheduler services/tools, we first need to be able to restore 
 BlockGroup and schema information in DataNode, which will be used to 
 construct coding work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8035) Move checking replication and get client DN to BM and DM respectively

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389944#comment-14389944
 ] 

Hadoop QA commented on HDFS-8035:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708569/HDFS-8035.000.patch
  against trunk revision 2daa478.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestReplication
  org.apache.hadoop.hdfs.TestLeaseRecovery
  org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshot
  org.apache.hadoop.hdfs.TestPread
  
org.apache.hadoop.hdfs.server.namenode.TestFavoredNodesEndToEnd
  
org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
  
org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOnSameDN
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotFileLength
  org.apache.hadoop.hdfs.TestSafeMode
  
org.apache.hadoop.hdfs.server.namenode.TestNamenodeCapacityReport
  org.apache.hadoop.hdfs.server.namenode.TestNamenodeRetryCache
  org.apache.hadoop.hdfs.TestParallelShortCircuitRead
  org.apache.hadoop.hdfs.server.namenode.TestFSEditLogLoader
  org.apache.hadoop.hdfs.TestFSInputChecker
  
org.apache.hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks
  org.apache.hadoop.hdfs.tools.TestDebugAdmin
  org.apache.hadoop.hdfs.TestSetrepIncreasing
  org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics
  
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistFiles
  org.apache.hadoop.fs.TestEnhancedByteBufferAccess
  
org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
  org.apache.hadoop.hdfs.TestMultiThreadedHflush
  org.apache.hadoop.hdfs.TestParallelRead
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestSetQuotaWithSnapshot
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestNestedSnapshots
  org.apache.hadoop.hdfs.TestEncryptionZonesWithKMS
  org.apache.hadoop.hdfs.tools.TestStoragePolicyCommands
  org.apache.hadoop.hdfs.TestDFSRemove
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot
  org.apache.hadoop.hdfs.TestHFlush
  org.apache.hadoop.hdfs.server.namenode.TestHDFSConcat
  
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
  org.apache.hadoop.hdfs.TestSetTimes
  org.apache.hadoop.hdfs.server.namenode.TestAddBlock
  
org.apache.hadoop.hdfs.server.datanode.TestDnRespectsBlockReportSplitThreshold
  org.apache.hadoop.hdfs.TestMissingBlocksAlert
  org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
  org.apache.hadoop.hdfs.TestBlocksScheduledCounter
  org.apache.hadoop.hdfs.TestEncryptedTransfer
  org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs
  org.apache.hadoop.hdfs.server.mover.TestMover
  
org.apache.hadoop.hdfs.server.namenode.snapshot.TestUpdatePipelineWithSnapshots
  org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode
  org.apache.hadoop.fs.TestUnbuffer
  org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead
  org.apache.hadoop.hdfs.TestQuota
  org.apache.hadoop.hdfs.TestDFSClientFailover
  

[jira] [Assigned] (HDFS-8037) WebHDFS: CheckAccess silently accepts certain malformed FsActions

2015-03-31 Thread Walter Su (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Walter Su reassigned HDFS-8037:
---

Assignee: Walter Su

 WebHDFS: CheckAccess silently accepts certain malformed FsActions
 -

 Key: HDFS-8037
 URL: https://issues.apache.org/jira/browse/HDFS-8037
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Jake Low
Assignee: Walter Su
Priority: Minor
  Labels: easyfix, newbie

 WebHDFS's {{CHECKACCESS}} operation accepts a parameter called {{fsaction}}, 
 which represents the type(s) of access to check for.
 According to the documentation, and also the source code, the domain of 
 {{fsaction}} is the set of strings matched by the regex {{\[rwx-\]{3\}}}. 
 This domain is wider than the set of valid {{FsAction}} objects, because it 
 doesn't guarantee sensible ordering of access types. For example, the strings 
 {{rxw}} and {{--r}} are valid {{fsaction}} parameter values, but don't 
 correspond to valid {{FsAction}} instances.
 The result is that WebHDFS silently accepts {{fsaction}} parameter values 
 which don't match any valid {{FsAction}} instance, but doesn't actually 
 perform any permissions checking in this case.
 For example, here's a {{CHECKACCESS}} call where we request {{rw-}} access 
 on a file which we only have permission to read and execute. It raises an 
 exception, as it should.
 {code:none}
 curl -i -X GET \
   "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-x"
 HTTP/1.1 403 Forbidden
 Content-Type: application/json
 {
   "RemoteException": {
     "exception": "AccessControlException",
     "javaClassName": "org.apache.hadoop.security.AccessControlException",
     "message": "Permission denied: user=nobody, access=READ_WRITE, 
       inode=\"/myfile\":root:supergroup:drwxr-xr-x"
   }
 }
 {code}
 But if we instead request {{r-w}} access, the call appears to succeed:
 {code:none}
 curl -X GET \
   "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-w"
 HTTP/1.1 200 OK
 Content-Length: 0
 {code}
 As I see it, the fix would be to change the regex pattern in 
 {{FsActionParam}} to something like {{\[r-\]\[w-\]\[x-\]}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389967#comment-14389967
 ] 

Rakesh R commented on HDFS-7937:


Thanks [~kaisasak]!

In the latest patch the {{INodeFile#computeQuotaUsageWithStriped}} related 
changes are missing; could you please tell me if there is any specific reason 
for this? Apart from that, the latest patch looks pretty good.

Also, one general observation: instead of {{SubmitPatch}}, can we do 
{{StartProgress}}? This would avoid triggering Jenkins and then adding a 
Hudson QA comment in the jira. As [~zhz] mentioned, Jenkins only works on 
{{trunk}}.

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch, HDFS-7937.3.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7888) Change DataStreamer/DFSOutputStream/DFSPacket for convenience of subclassing

2015-03-31 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389983#comment-14389983
 ] 

Li Bo commented on HDFS-7888:
-

That's a very good improvement to the patch. I will also update the patch for 
HDFS-7889.

 Change DataStreamer/DFSOutputStream/DFSPacket for convenience of subclassing
 

 Key: HDFS-7888
 URL: https://issues.apache.org/jira/browse/HDFS-7888
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7888-001.patch, HDFS-7888-trunk-001.patch, 
 HDFS-7888-trunk-002.patch


 HDFS-7793 refactors class {{DFSOutputStream}} on trunk, making 
 {{DFSOutputStream}} a class without any inner classes. We want to subclass 
 {{DFSOutputStream}} to support striping layout writing. This JIRA depends 
 upon HDFS-7793 and tries to change DataStreamer/DFSOutputStream/DFSPacket for 
 convenience of subclassing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files

2015-03-31 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7889:

Attachment: HDFS-7889-006.patch

 Subclass DFSOutputStream to support writing striping layout files
 -

 Key: HDFS-7889
 URL: https://issues.apache.org/jira/browse/HDFS-7889
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, 
 HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch, 
 HDFS-7889-006.patch


 After HDFS-7888, we can subclass  {{DFSOutputStream}} to support writing 
 striping layout files. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8008) Support client-side back off when the datanodes are congested

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389808#comment-14389808
 ] 

Hadoop QA commented on HDFS-8008:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708532/HDFS-8008.000.patch
  against trunk revision e428fea.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10132//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10132//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10132//console

This message is automatically generated.

 Support client-side back off when the datanodes are congested
 -

 Key: HDFS-8008
 URL: https://issues.apache.org/jira/browse/HDFS-8008
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8008.000.patch


 HDFS-7270 introduces the mechanism for the DataNode to signal congestion. 
 DFSClient should be able to recognize the signals and back off.
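 
 One possible backoff policy, as a hedged sketch (the ack shape, constants, 
 and names are assumptions for illustration, not the attached patch):
 {code}
 class BackoffSketch {
   enum ECN { SUPPORTED, CONGESTED }

   private long backoffMs = 0;

   // Exponential backoff while acks report congestion; reset once clear.
   void onAckReceived(ECN ecn) throws InterruptedException {
     if (ecn == ECN.CONGESTED) {
       backoffMs = Math.min(Math.max(backoffMs * 2, 100), 10_000);
       Thread.sleep(backoffMs); // pause before sending the next packet
     } else {
       backoffMs = 0;
     }
   }
 }
 {code}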



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8033) Erasure coding: stateful (non-positional) read from files in striped layout

2015-03-31 Thread GAO Rui (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389891#comment-14389891
 ] 

GAO Rui commented on HDFS-8033:
---

[~zhz] 
bq. stateful (non-positional) read
Does this mean reading the whole file sequentially, without any positional 
requirement?

 Erasure coding: stateful (non-positional) read from files in striped layout
 ---

 Key: HDFS-8033
 URL: https://issues.apache.org/jira/browse/HDFS-8033
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8034) Fix TestDFSClientRetries#testDFSClientConfigurationLocateFollowingBlockInitialDelay for Windows

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389950#comment-14389950
 ] 

Hadoop QA commented on HDFS-8034:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708565/HDFS-8034.00.patch
  against trunk revision 18a91fe.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10134//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10134//console

This message is automatically generated.

 Fix 
 TestDFSClientRetries#testDFSClientConfigurationLocateFollowingBlockInitialDelay
  for Windows 
 

 Key: HDFS-8034
 URL: https://issues.apache.org/jira/browse/HDFS-8034
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao
 Attachments: HDFS-8034.00.patch


 TestDFSClientRetries#testDFSClientConfigurationLocateFollowingBlockInitialDelay
  failed subsequent tests on Windows because this test case fails to shut down 
 the MiniDFS cluster. I will post a patch for it shortly.
 {code}
 testRetryOnChecksumFailure(org.apache.hadoop.hdfs.TestDFSClientRetries)  Time 
 elapsed: 0.012 sec   ERROR!
 java.io.IOException: Could not fully delete 
 D:\w\hbk\hadoop-hdfs-project\hadoop-hdfs\target\test\data\dfs\name1
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.createNameNodesAndSetConf(MiniDFSCluster.java:943)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:814)
   at org.apache.hadoop.hdfs.MiniDFSCluster.init(MiniDFSCluster.java:473)
   at 
 org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:432)
   at 
 org.apache.hadoop.hdfs.TestDFSClientRetries.testRetryOnChecksumFailure(TestDFSClientRetries.java:1091)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8036) Use snapshot path as source when using snapshot diff report in DistCp

2015-03-31 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8036:

Attachment: HDFS-8036.000.patch

Initial patch to fix.

 Use snapshot path as source when using snapshot diff report in DistCp
 -

 Key: HDFS-8036
 URL: https://issues.apache.org/jira/browse/HDFS-8036
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.7.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8036.000.patch


 When using the snapshot diff report for distcp (HDFS-7535), the semantics 
 should be to apply the diff to the target in order to sync the target with 
 source@snapshot2. Therefore, after syncing based on the snapshot diff report, 
 we should append the name of snapshot2 to the original source path and use it 
 as the new source name. 
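 
 In path terms the proposal amounts to something like the sketch below 
 ({{.snapshot}} is HDFS's snapshot directory convention; the helper and 
 variable names are illustrative):
 {code}
 import org.apache.hadoop.fs.Path;

 class SnapshotSourceSketch {
   // After applying the diff, copy from the frozen snapshot image rather
   // than the live directory, so the source cannot drift mid-copy.
   static Path toSnapshotPath(Path source, String snapshotName) {
     // e.g. /src/dir + "snapshot2" -> /src/dir/.snapshot/snapshot2
     return new Path(source, ".snapshot/" + snapshotName);
   }
 }
 {code}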



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8037) WebHDFS: CheckAccess silently accepts certain malformed FsActions

2015-03-31 Thread Jake Low (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jake Low updated HDFS-8037:
---
Description: 
WebHDFS's {{CHECKACCESS}} operation accepts a parameter called {{fsaction}}, 
which represents the type(s) of access to check for.

According to the documentation, and also the source code, the domain of 
{{fsaction}} is the set of strings matched by the regex {{\[rwx-\]{3\}}}. 
This domain is wider than the set of valid {{FsAction}} objects, because it 
doesn't guarantee sensible ordering of access types. For example, the strings 
{{rxw}} and {{--r}} are valid {{fsaction}} parameter values, but don't 
correspond to valid {{FsAction}} instances.

The result is that WebHDFS silently accepts {{fsaction}} parameter values which 
don't match any valid {{FsAction}} instance, but doesn't actually perform any 
permissions checking in this case.

For example, here's a {{CHECKACCESS}} call where we request {{rw-}} access on 
a file which we only have permission to read and execute. It raises an 
exception, as it should.

{code:none}
curl -i -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-x"

HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "RemoteException": {
    "exception": "AccessControlException",
    "javaClassName": "org.apache.hadoop.security.AccessControlException",
    "message": "Permission denied: user=nobody, access=READ_WRITE, 
      inode=\"/myfile\":root:supergroup:drwxr-xr-x"
  }
}
{code}

But if we instead request {{r-w}} access, the call appears to succeed:

{code:none}
curl -i -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-w"

HTTP/1.1 200 OK
Content-Length: 0
{code}

As I see it, the fix would be to change the regex pattern in {{FsActionParam}} 
to something like {{\[r-\]\[w-\]\[x-\]}}.

  was:
WebHDFS's {{CHECKACCESS}} operation accepts a parameter called {{fsaction}}, 
which represents the type(s) of access to check for.

According to the documentation, and also the source code, the domain of 
{{fsaction}} is the set of strings matched by the regex {{\[rwx-\]{3\}}}. 
This domain is wider than the set of valid {{FsAction}} objects, because it 
doesn't guarantee sensible ordering of access types. For example, the strings 
{{rxw}} and {{--r}} are valid {{fsaction}} parameter values, but don't 
correspond to valid {{FsAction}} instances.

The result is that WebHDFS silently accepts {{fsaction}} parameter values which 
don't match any valid {{FsAction}} instance, but doesn't actually perform any 
permissions checking in this case.

For example, here's a {{CHECKACCESS}} call where we request {{rw-}} access on 
a file which we only have permission to read and execute. It raises an 
exception, as it should.

{code:none}
curl -i -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-x"

HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "RemoteException": {
    "exception": "AccessControlException",
    "javaClassName": "org.apache.hadoop.security.AccessControlException",
    "message": "Permission denied: user=nobody, access=READ_WRITE, 
      inode=\"/myfile\":root:supergroup:drwxr-xr-x"
  }
}
{code}

But if we instead request {{r-w}} access, the call appears to succeed:

{code:none}
curl -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-w"

HTTP/1.1 200 OK
Content-Length: 0
{code}

As I see it, the fix would be to change the regex pattern in {{FsActionParam}} 
to something like {{\[r-\]\[w-\]\[x-\]}}.


 WebHDFS: CheckAccess silently accepts certain malformed FsActions
 -

 Key: HDFS-8037
 URL: https://issues.apache.org/jira/browse/HDFS-8037
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Jake Low
Assignee: Walter Su
Priority: Minor
  Labels: easyfix, newbie

 WebHDFS's {{CHECKACCESS}} operation accepts a parameter called {{fsaction}}, 
 which represents the type(s) of access to check for.
 According to the documentation, and also the source code, the domain of 
 {{fsaction}} is the set of strings matched by the regex {{\[rwx-\]{3\}}}. 
 This domain is wider than the set of valid {{FsAction}} objects, because it 
 doesn't guarantee sensible ordering of access types. For example, the strings 
 {{rxw}} and {{--r}} are valid {{fsaction}} parameter values, but don't 
 correspond to valid {{FsAction}} instances.
 The result is that WebHDFS silently accepts {{fsaction}} parameter values 
 which don't match any valid {{FsAction}} instance, but doesn't actually 
 perform any permissions checking in this case.
 For example, here's a {{CHECKACCESS}} call where we request {{rw-}} access 
 on a file which we only have permission to read and execute. It raises 

[jira] [Commented] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389867#comment-14389867
 ] 

Hadoop QA commented on HDFS-7937:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708581/HDFS-7937.3.patch
  against trunk revision 2daa478.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color}.  The patch appears to cause the build to 
fail.

Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10137//console

This message is automatically generated.

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch, HDFS-7937.3.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6666) Abort NameNode and DataNode startup if security is enabled but block access token is not enabled.

2015-03-31 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389948#comment-14389948
 ] 

Arpit Agarwal commented on HDFS-:
-

Hi [~vijaysbhat], thank you for volunteering to help with this issue and adding 
a test case.

You will need to enable the Maven startKdc profile for running secure NN tests. 
Secure NN uses ApacheDS but unfortunately the URL is broken. Looks like we'll 
need to fix the download URL to get startKdc working. Do you want to give it a 
shot too?

{code}
$ mvn -q test -PtestKerberos,startKdc -Dtest=TestSecureNameNode
 [exec] Result: 1
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-antrun-plugin:1.7:run (kdc) on project 
hadoop-common: An Ant BuildException has occured: Can't get 
http://newverhost.com/pub//directory/apacheds/unstable/1.5/1.5.7/apacheds-1.5.7.tar.gz
 to 
/Users/aagarwal/src/hdp/hadoop-common-project/hadoop-common/target/test-classes/kdc/downloads/apacheds-1.5.7.tar.gz
[ERROR] around Ant part ...<get 
dest="/Users/aagarwal/src/hdp/hadoop-common-project/hadoop-common/target/test-classes/kdc/downloads"
 skipexisting="true" verbose="true" 
src="http://newverhost.com/pub//directory/apacheds/unstable/1.5/1.5.7/apacheds-1.5.7.tar.gz"/>...
{code}

 Abort NameNode and DataNode startup if security is enabled but block access 
 token is not enabled.
 -

 Key: HDFS-
 URL: https://issues.apache.org/jira/browse/HDFS-
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode, namenode, security
Affects Versions: 3.0.0, 2.5.0
Reporter: Chris Nauroth
Assignee: Vijay Bhat
Priority: Minor

 Currently, if security is enabled by setting hadoop.security.authentication 
 to kerberos, but HDFS block access tokens are disabled by setting 
 dfs.block.access.token.enable to false (which is the default), then the 
 NameNode logs an error and proceeds, and the DataNode proceeds without even 
 logging an error.  This jira proposes that it's invalid to turn on security 
 without turning on block access tokens, and that it would be better to fail 
 fast and abort the daemons during startup if this happens.
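 
 A hedged sketch of the proposed fail-fast check (the placement and message 
 are illustrative; the configuration key is the one named above):
 {code}
 // Sketch only: abort startup when security is on but block access tokens
 // are off, instead of logging and continuing.
 if (UserGroupInformation.isSecurityEnabled()
     && !conf.getBoolean("dfs.block.access.token.enable", false)) {
   throw new IOException(
       "Security is enabled but dfs.block.access.token.enable is false; "
       + "refusing to start.");
 }
 {code}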



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8037) WebHDFS: CheckAccess silently accepts certain malformed FsActions

2015-03-31 Thread Jake Low (JIRA)
Jake Low created HDFS-8037:
--

 Summary: WebHDFS: CheckAccess silently accepts certain malformed 
FsActions
 Key: HDFS-8037
 URL: https://issues.apache.org/jira/browse/HDFS-8037
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: webhdfs
Affects Versions: 2.6.0
Reporter: Jake Low
Priority: Minor


WebHDFS's {{CHECKACCESS}} operation accepts a parameter called {{fsaction}}, 
which represents the type(s) of access to check for.

According to the documentation, and also the source code, the domain of 
{{fsaction}} is the set of strings matched by the regex {{\[rwx-\]{3\}}}. 
This domain is wider than the set of valid {{FsAction}} objects, because it 
doesn't guarantee sensible ordering of access types. For example, the strings 
{{rxw}} and {{--r}} are valid {{fsaction}} parameter values, but don't 
correspond to valid {{FsAction}} instances.

The result is that WebHDFS silently accepts {{fsaction}} parameter values which 
don't match any valid {{FsAction}} instance, but doesn't actually perform any 
permissions checking in this case.

For example, here's a {{CHECKACCESS}} call where we request {{rw-}} access on 
a file which we only have permission to read and execute. It raises an 
exception, as it should.

{code:none}
curl -i -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-x"

HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "RemoteException": {
    "exception": "AccessControlException",
    "javaClassName": "org.apache.hadoop.security.AccessControlException",
    "message": "Permission denied: user=nobody, access=READ_WRITE, 
      inode=\"/myfile\":root:supergroup:drwxr-xr-x"
  }
}
{code}

But if we instead request {{r-w}} access, the call appears to succeed:

{code:none}
curl -X GET \
  "http://localhost:50070/webhdfs/v1/myfile?op=CHECKACCESS&user.name=nobody&fsaction=r-w"

HTTP/1.1 200 OK
Content-Length: 0
{code}

As I see it, the fix would be to change the regex pattern in {{FsActionParam}} 
to something like {{\[r-\]\[w-\]\[x-\]}}.
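 
A quick way to see the difference between the two patterns (plain 
java.util.regex; illustrative only, not the {{FsActionParam}} source):
{code}
import java.util.regex.Pattern;

class FsActionRegexSketch {
  public static void main(String[] args) {
    Pattern loose = Pattern.compile("[rwx-]{3}");     // current pattern
    Pattern strict = Pattern.compile("[r-][w-][x-]"); // proposed pattern

    System.out.println(loose.matcher("r-w").matches());  // true: malformed value slips through
    System.out.println(strict.matcher("r-w").matches()); // false: rejected by the fix
    System.out.println(strict.matcher("r-x").matches()); // true: well-formed value still accepted
  }
}
{code}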



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8036) Use snapshot path as source when using snapshot diff report in DistCp

2015-03-31 Thread Jing Zhao (JIRA)
Jing Zhao created HDFS-8036:
---

 Summary: Use snapshot path as source when using snapshot diff 
report in DistCp
 Key: HDFS-8036
 URL: https://issues.apache.org/jira/browse/HDFS-8036
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.7.0
Reporter: Jing Zhao
Assignee: Jing Zhao


When using the snapshot diff report for distcp (HDFS-7535), the semantics 
should be to apply the diff to the target in order to sync the target with 
source@snapshot2. Therefore, after syncing based on the snapshot diff report, 
we should append the name of snapshot2 to the original source path and use it 
as the new source name. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-7937:
-
Attachment: HDFS-7937.3.patch

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch, HDFS-7937.3.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-31 Thread Kai Sasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kai Sasaki updated HDFS-7937:
-
Status: Patch Available  (was: Open)

 Erasure Coding: INodeFile quota computation unit tests
 --

 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor
 Attachments: HDFS-7937.1.patch, HDFS-7937.2.patch, HDFS-7937.3.patch


 Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7889) Subclass DFSOutputStream to support writing striping layout files

2015-03-31 Thread Li Bo (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389987#comment-14389987
 ] 

Li Bo commented on HDFS-7889:
-

Patch 006 removes {{getStreamer()}} and switches streamer in 
{{DFSStripedOutputStream}} 

 Subclass DFSOutputStream to support writing striping layout files
 -

 Key: HDFS-7889
 URL: https://issues.apache.org/jira/browse/HDFS-7889
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, 
 HDFS-7889-003.patch, HDFS-7889-004.patch, HDFS-7889-005.patch, 
 HDFS-7889-006.patch


 After HDFS-7888, we can subclass  {{DFSOutputStream}} to support writing 
 striping layout files. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8035) Move checking replication and get client DN to BM and DM respectively

2015-03-31 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai updated HDFS-8035:
-
Attachment: HDFS-8035.001.patch

 Move checking replication and get client DN to BM and DM respectively
 -

 Key: HDFS-8035
 URL: https://issues.apache.org/jira/browse/HDFS-8035
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-8035.000.patch, HDFS-8035.001.patch


 There is functionality in {{FSNameSystem}} to check replication and to get 
 the datanode based on the client name. This jira proposes to move this 
 functionality to {{BlockManager}} and {{DatanodeManager}} respectively.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8036) Use snapshot path as source when using snapshot diff report in DistCp

2015-03-31 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-8036:

Status: Patch Available  (was: Open)

 Use snapshot path as source when using snapshot diff report in DistCp
 -

 Key: HDFS-8036
 URL: https://issues.apache.org/jira/browse/HDFS-8036
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.7.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8036.000.patch


 When using the snapshot diff report for distcp (HDFS-7535), the semantics 
 should be to apply the diff to the target in order to sync the target with 
 source@snapshot2. Therefore, after syncing based on the snapshot diff report, 
 we should append the name of snapshot2 to the original source path and use it 
 as the new source name. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8036) Use snapshot path as source when using snapshot diff report in DistCp

2015-03-31 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14389855#comment-14389855
 ] 

Hadoop QA commented on HDFS-8036:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12708575/HDFS-8036.000.patch
  against trunk revision 2daa478.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-tools/hadoop-distcp.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10136//testReport/
Console output: 
https://builds.apache.org/job/PreCommit-HDFS-Build/10136//console

This message is automatically generated.

 Use snapshot path as source when using snapshot diff report in DistCp
 -

 Key: HDFS-8036
 URL: https://issues.apache.org/jira/browse/HDFS-8036
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: distcp
Affects Versions: 2.7.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-8036.000.patch


 When using the snapshot diff report for distcp (HDFS-7535), the semantics 
 should be to apply the diff to the target in order to sync the target with 
 source@snapshot2. Therefore, after syncing based on the snapshot diff report, 
 we should append the name of snapshot2 to the original source path and use it 
 as the new source name. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8004) Use KeyProviderCryptoExtension#warmUpEncryptedKeys when creating an encryption zone

2015-03-31 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-8004:
--
Fix Version/s: 2.8.0

thanks Arun, don't forget to set the fix version ;)

 Use KeyProviderCryptoExtension#warmUpEncryptedKeys when creating an 
 encryption zone
 ---

 Key: HDFS-8004
 URL: https://issues.apache.org/jira/browse/HDFS-8004
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: encryption
Affects Versions: 2.6.0
Reporter: Andrew Wang
Assignee: Andrew Wang
Priority: Trivial
 Fix For: 2.8.0

 Attachments: hdfs-8004.001.patch


 It'd be slightly better to use the provided warm-up method, even though what 
 we do now (getting and throwing away a key) is functionally the same.
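 
 For comparison, the two approaches side by side (a sketch assuming a 
 {{KeyProviderCryptoExtension}} instance named {{provider}} and an encryption 
 zone key {{keyName}}):
 {code}
 // Provided warm-up method: primes the cached EDEK queue for the key.
 provider.warmUpEncryptedKeys(keyName);

 // Current approach, functionally similar: generate an EDEK and discard it.
 provider.generateEncryptedKey(keyName);
 {code}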



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8011) standby nn can't started

2015-03-31 Thread Yongjun Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388618#comment-14388618
 ] 

Yongjun Zhang commented on HDFS-8011:
-

Hi [~fujie],

If the file was indeed deleted, and it still has an OP_CLOSE in the edit log 
file, then the observation "If we restart SNN A again, editlog-file-2 could be 
loaded correctly just like editlog-file-1 in the last restart operation" is 
indeed mysterious, unless OP_CLOSE silently ignores deleted files.

Can we dump the edit log with the oev tool and see whether the file involved 
in the OP_CLOSE operation that throws the NPE was deleted (either it or its 
parent has an OP_DELETE) before the OP_CLOSE? 

What does "20,000 operations failed in 500,000 operations" mean? What are the 
error symptoms? As Vinayakumar requested, can we analyze the stack traces of 
all failures to see if they share the same exception stack? 

Since you mentioned a problem with OP_ADD_BLOCK, it seems we are adding a 
block to a deleted file? If so, I think it's very likely related to delayed 
block removal, which in turn relates to the fact that a datanode reports 
heartbeats to both the active and the standby at the same time.

Thanks.





 standby nn can't started
 

 Key: HDFS-8011
 URL: https://issues.apache.org/jira/browse/HDFS-8011
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.3.0
 Environment: centeros 6.2  64bit 
Reporter: fujie

 We have seen a crash when starting the standby namenode, with fatal errors. 
 Any solutions, workarounds, or ideas would be helpful for us.
 1. Here is the context: 
   At the beginning we had 2 namenodes, with A as active and B as standby. For 
 some reasons, namenode A died, so namenode B took over as active.
   When we tried to restart A after a minute, it couldn't work. During this 
 time a lot of files were put to HDFS, and a lot of files were renamed. 
   Namenode A crashed while awaiting reported blocks in safemode each 
 time.
  
 2. We can see error log below:
   1)2015-03-30  ERROR 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: Encountered exception 
 on operation CloseOp [length=0, inodeId=0, 
 path=/xxx/_temporary/xxx/part-r-00074.bz2, replication=3, 
 mtime=1427699913947, atime=1427699081161, blockSize=268435456, 
 blocks=[blk_2103131025_1100889495739], permissions=dm:dm:rw-r--r--, 
 clientName=, clientMachine=, opCode=OP_CLOSE, txid=7632753612]
 java.lang.NullPointerException
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.setGenerationStampAndVerifyReplicas(BlockInfoUnderConstruction.java:247)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.commitBlock(BlockInfoUnderConstruction.java:267)
 at 
 org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.forceCompleteBlock(BlockManager.java:639)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.updateBlocks(FSEditLogLoader.java:813)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:383)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:209)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:122)
 at 
 org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:737)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:227)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:321)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$0(EditLogTailer.java:302)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:296)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:356)
 at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1528)
 at 
 org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:413)
 at 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:292)
 
2)2015-03-30  FATAL 
 org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Unknown error 
 encountered while tailing edits. Shutting down standby N
 N.
 java.io.IOException: Failed to apply edit log operation AddBlockOp 
 [path=/xxx/_temporary/xxx/part-m-00121, 
 penultimateBlock=blk_2102331803_1100888911441, 
 lastBlock=blk_2102661068_1100889009168, RpcClientId=, RpcCallId=-2]: error
 null
 at 
 

[jira] [Commented] (HDFS-8002) Website refers to /trash directory

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388627#comment-14388627
 ] 

Hudson commented on HDFS-8002:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #140 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/140/])
HDFS-8002. Website refers to /trash directory. Contributed by Brahma Reddy 
Battula. (aajisaka: rev e7ea2a8e8f0a7b428ef10552885757b99b59e4dc)
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 Website refers to /trash directory
 --

 Key: HDFS-8002
 URL: https://issues.apache.org/jira/browse/HDFS-8002
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: documentation
Reporter: Mike Drob
Assignee: Brahma Reddy Battula
 Fix For: 2.8.0

 Attachments: HDFS-8002.patch, HDFS-8003-002.patch


 On 
 http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#File_Deletes_and_Undeletes
  the section on trash refers to files residing in {{/trash}}.
 I think this is an error, as files actually go to user-specific trash 
 directories like {{/user/hdfs/.Trash}}.
 Either the site needs to be updated to mention user-specific directories, or, 
 if this is a change from previous behaviour, then maybe that can be mentioned 
 instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8029) NPE during disk usage calculation on snapshot directory, after a sub folder is deleted

2015-03-31 Thread kanaka kumar avvaru (JIRA)
kanaka kumar avvaru created HDFS-8029:
-

 Summary: NPE during disk usage calculation on snapshot directory, 
after a sub folder is deleted
 Key: HDFS-8029
 URL: https://issues.apache.org/jira/browse/HDFS-8029
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: kanaka kumar avvaru
Assignee: kanaka kumar avvaru


ContentSummary computation is causing a NullPointerException on a snapshot 
directory if some sub directory is deleted.
  Following are the steps to reproduce the issue.
  
  1. Create a root directory /test
  2. Create sub dir named as  /test/sub1
  3. Create sub dir in sub1 as  /test/sub1/sub2
  4. Create a file at  /test/sub1/file1
  5. Create a file at /test/sub1/sub2/file1
  6. Enable snapshot on sub1 (hadoop dfsadmin -allowSnapshot test/sub1)
  7. Create snapshot1 on /test/sub1
  8. Delete directory /test/sub1/sub2 (recursively)
  9. Create  snapshot2 on /test/sub1
  10. Execute du command on /test (hadoop fs -du  /test/) 
  
  
  Gives NullPointerException in CLI. NameNode logs the exception as
  ... java.lang.NullPointerException at 
org.apache.hadoop.hdfs.server.namenode.ContentSummaryComputationContext.getBlockStoragePolicySuite(ContentSummaryComputationContext.java:122)
 ...
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7671) hdfs user guide should point to the common rack awareness doc

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388717#comment-14388717
 ] 

Hudson commented on HDFS-7671:
--

FAILURE: Integrated in Hadoop-trunk-Commit #7476 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/7476/])
HDFS-7671. hdfs user guide should point to the common rack awareness doc. 
Contributed by Kai Sasaki. (aajisaka: rev 
859cab2f2273f563fd70e3e616758edef91ccf41)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsUserGuide.md


 hdfs user guide should point to the common rack awareness doc
 -

 Key: HDFS-7671
 URL: https://issues.apache.org/jira/browse/HDFS-7671
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: documentation
Reporter: Allen Wittenauer
Assignee: Kai Sasaki
 Fix For: 2.8.0

 Attachments: HDFS-7671.1.patch, HDFS-7671.2.patch, HDFS-7671.3.patch


 HDFS user guide has a section on rack awareness that should really just be a 
 pointer to the common doc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8010) Erasure coding: extend UnderReplicatedBlocks to accurately handle striped blocks

2015-03-31 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388555#comment-14388555
 ] 

Rakesh R commented on HDFS-8010:


Minor mistake in my above comment; please read the suggested approach as:

{code}
  private boolean veryUnderReplicated(int curReplicas, int expectedReplicas,
      boolean isStriped) {
    if (!isStriped) {
      // contiguous block: very under-replicated when fewer than a third of
      // the expected replicas are live
      return (curReplicas * 3) < expectedReplicas;
    } else {
      // striped block group: very under-replicated when at most two blocks
      // beyond the data blocks survive
      return curReplicas <= HdfsConstants.NUM_DATA_BLOCKS + 2;
    }
  }

  private boolean highestPriority(int curReplicas, boolean isStriped) {
    if (!isStriped) {
      // a single live replica leaves the block one failure from loss
      return curReplicas == 1;
    } else {
      // a striped group with only its data blocks left has no redundancy
      return curReplicas == HdfsConstants.NUM_DATA_BLOCKS;
    }
  }
{code}

 Erasure coding: extend UnderReplicatedBlocks to accurately handle striped 
 blocks
 

 Key: HDFS-8010
 URL: https://issues.apache.org/jira/browse/HDFS-8010
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang
 Attachments: HDFS-8010-000.patch


 This JIRA tracks efforts to accurately assess the _risk level_ of striped 
 block groups with missing blocks when they are added to 
 {{UnderReplicatedBlocks}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-6634) inotify in HDFS

2015-03-31 Thread Benoit Perroud (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388642#comment-14388642
 ] 

Benoit Perroud commented on HDFS-6634:
--

We started to migrate our code to this implementation. It's just awesome. 
Thanks a lot [~james.thomas] for the work!

I still have a quick question: any reason why the transaction id is not 
embedded in the Event object?
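
For context, a hedged usage sketch of the inotify API as shipped in 2.6, where 
take() hands back an Event with no transaction id attached (the NameNode URI 
and class name are assumptions):

{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSInotifyEventInputStream;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.inotify.Event;

public class InotifyTail {
  public static void main(String[] args) throws Exception {
    HdfsAdmin admin = new HdfsAdmin(URI.create("hdfs://namenode:8020"),
        new Configuration());
    DFSInotifyEventInputStream events = admin.getInotifyEventStream();
    while (true) {
      Event event = events.take();  // blocks until the next edit-log event
      if (event.getEventType() == Event.EventType.CREATE) {
        Event.CreateEvent create = (Event.CreateEvent) event;
        System.out.println("created: " + create.getPath());
      }
    }
  }
}
{code}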



 inotify in HDFS
 ---

 Key: HDFS-6634
 URL: https://issues.apache.org/jira/browse/HDFS-6634
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs-client, namenode, qjm
Reporter: James Thomas
Assignee: James Thomas
 Fix For: 2.6.0

 Attachments: HDFS-6634.2.patch, HDFS-6634.3.patch, HDFS-6634.4.patch, 
 HDFS-6634.5.patch, HDFS-6634.6.patch, HDFS-6634.7.patch, HDFS-6634.8.patch, 
 HDFS-6634.9.patch, HDFS-6634.patch, inotify-design.2.pdf, 
 inotify-design.3.pdf, inotify-design.4.pdf, inotify-design.pdf, 
 inotify-intro.2.pdf, inotify-intro.pdf


 Design a mechanism for applications like search engines to access the HDFS 
 edit stream.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7748) Separate ECN flags from the Status in the DataTransferPipelineAck

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388631#comment-14388631
 ] 

Hudson commented on HDFS-7748:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #140 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/140/])
HDFS-7748. Separate ECN flags from the Status in the DataTransferPipelineAck. 
Contributed by Anu Engineer and Haohui Mai. (wheat9: rev 
b80457158daf0dc712fbe5695625cc17d70d4bb4)
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/protocol/datatransfer/PipelineAck.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/proto/datatransfer.proto
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockReceiver.java
Addendum for HDFS-7748. (wheat9: rev 0967b1d99d7001cd1d09ebd29b9360f1079410e8)
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDataTransferProtocol.java


 Separate ECN flags from the Status in the DataTransferPipelineAck
 -

 Key: HDFS-7748
 URL: https://issues.apache.org/jira/browse/HDFS-7748
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Anu Engineer
Priority: Blocker
 Attachments: HDFS-7748.007-addendum.patch, HDFS-7748.007.patch, 
 hdfs-7748.001.patch, hdfs-7748.002.patch, hdfs-7748.003.patch, 
 hdfs-7748.004.patch, hdfs-7748.005.patch, hdfs-7748.006.patch, 
 hdfs-7748.branch-2.7.006.patch


 Prior to the discussions on HDFS-7270, the old clients might fail to talk to 
 the newer server when ECN is turned on. This jira proposes to move the ECN 
 flags into a separate protobuf field to make the ack compatible across both 
 versions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7944) Minor cleanup of BlockPoolManager#getAllNamenodeThreads

2015-03-31 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388636#comment-14388636
 ] 

Hudson commented on HDFS-7944:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #140 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/140/])
HDFS-7944. Minor cleanup of BlockPoolManager#getAllNamenodeThreads. (Arpit 
Agarwal) (arp: rev 85dc3c14b2ca4b01a93361bb925c39a22a6fd8db)
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDatanodeProtocolRetryPolicy.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockScanner.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBlockReports.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockRecovery.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDeleteBlockPool.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolManager.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestRefreshNamenodes.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeExit.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMultipleRegistrations.java
* 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestTriggerBlockReport.java


 Minor cleanup of BlockPoolManager#getAllNamenodeThreads
 ---

 Key: HDFS-7944
 URL: https://issues.apache.org/jira/browse/HDFS-7944
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Arpit Agarwal
Assignee: Arpit Agarwal
Priority: Minor
 Fix For: 2.8.0

 Attachments: HDFS-7944.01.patch, HDFS-7944.02.patch


 {{BlockPoolManager#getAllNamenodeThreads}} can avoid unnecessary list-to-array 
 conversion (and vice versa) by returning an unmodifiable list.
 Since NN addition/removal is relatively rare, we can just use a 
 {{CopyOnWriteArrayList}} for concurrency.
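
 A hedged sketch of that pattern (not the actual patch; it would live in the 
 same package as {{BPOfferService}}, and the class and field names here are 
 illustrative):

{code}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class BlockPoolManagerSketch {
  // Mutated only on the rare NN add/remove path, so copy-on-write is cheap.
  private final List<BPOfferService> offerServices =
      new CopyOnWriteArrayList<>();

  List<BPOfferService> getAllNamenodeThreads() {
    // Readers get a snapshot-consistent, unmodifiable view; no list/array
    // conversion is needed on the hot path.
    return Collections.unmodifiableList(offerServices);
  }
}
{code}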



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7941) hsync() not working

2015-03-31 Thread Sverre Bakke (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sverre Bakke updated HDFS-7941:
---
Description: 
When using SequenceFile.Writer and repeatedly appending to and syncing a file, 
the sync does not appear to work other than:
- once after writing the headers
- when closing.

Imagine the following test case:
http://pastebin.com/Y9xysCRX

This code appends a new record every second and then immediately syncs it. 
One would expect the file to grow with every append; however, this does not 
happen.

After watching the behavior I noticed that it only syncs the headers at the 
very beginning (producing a file of 164 bytes) and then never again until it 
is closed, despite hsync() being called after every append.

The debug logs confirm this behavior (I ran the provided code example and 
grepped for sync):

SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
2015-03-17 15:55:14 DEBUG ProtobufRpcEngine:253 - Call: fsync took 11ms

This was the only time the code ran fsync throughout the entire execution.

This has been tested (with similar result) for the following deployments:
- sequencefile with no compression
- sequencefile with record compression
- sequencefile with block compression
- textfile with no compression
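
In case the pastebin link rots: a minimal sketch of the kind of test case 
described (an assumption of the shape of that code, not a copy of it; the 
path and key/value types are illustrative):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class HsyncTest {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(new Path("/tmp/hsync-test.seq")),
        SequenceFile.Writer.keyClass(LongWritable.class),
        SequenceFile.Writer.valueClass(Text.class));
    for (long i = 0; i < 60; i++) {
      writer.append(new LongWritable(i), new Text("record-" + i));
      writer.hsync();      // expected to flush each record to the pipeline
      Thread.sleep(1000);  // one record per second, as described
    }
    writer.close();
  }
}
{code}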

  was:
When using SequenceFile.Writer and repeatedly appending to and syncing a file, 
the sync does not appear to work other than:
- once after writing the headers
- when closing.

Imagine the following test case:
http://pastebin.com/Y9xysCRX

This code appends a new record every second and then immediately syncs it. 
One would expect the file to grow with every append; however, this does not 
happen.

After watching the behavior I noticed that it only syncs the headers at the 
very beginning (producing a file of 164 bytes) and then never again until it 
is closed, despite hsync() being called after every append.

The debug logs confirm this behavior (I ran the provided code example and 
grepped for sync):

SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
details.
2015-03-17 15:55:14 DEBUG ProtobufRpcEngine:253 - Call: fsync took 11ms

This was the only time the code ran fsync throughout the entire execution.


 hsync() not working
 ---

 Key: HDFS-7941
 URL: https://issues.apache.org/jira/browse/HDFS-7941
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.6.0
 Environment: HDP 2.2 running on Redhat
Reporter: Sverre Bakke

 When using SequenceFile.Writer and repeatedly appending to and syncing a 
 file, the sync does not appear to work other than:
 - once after writing the headers
 - when closing.
 Imagine the following test case:
 http://pastebin.com/Y9xysCRX
 This code appends a new record every second and then immediately syncs it. 
 One would expect the file to grow with every append; however, this does not 
 happen.
 After watching the behavior I noticed that it only syncs the headers at the 
 very beginning (producing a file of 164 bytes) and then never again until it 
 is closed, despite hsync() being called after every append.
 The debug logs confirm this behavior (I ran the provided code example and 
 grepped for sync):
 SLF4J: Failed to load class org.slf4j.impl.StaticLoggerBinder.
 SLF4J: Defaulting to no-operation (NOP) logger implementation
 SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further 
 details.
 2015-03-17 15:55:14 DEBUG ProtobufRpcEngine:253 - Call: fsync took 11ms
 This was the only time the code ran fsync throughout the entire execution.
 This has been tested (with similar result) for the following deployments:
 - sequencefile with no compression
 - sequencefile with record compression
 - sequencefile with block compression
 - textfile with no compression



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-6945) BlockManager should remove a block from excessReplicateMap and decrement ExcessBlocks metric when the block is removed

2015-03-31 Thread Akira AJISAKA (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira AJISAKA updated HDFS-6945:

Attachment: HDFS-6945-005.patch

Thanks [~szetszwo] for the comment. Cleaned up the patch.

 BlockManager should remove a block from excessReplicateMap and decrement 
 ExcessBlocks metric when the block is removed
 --

 Key: HDFS-6945
 URL: https://issues.apache.org/jira/browse/HDFS-6945
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.5.0
Reporter: Akira AJISAKA
Assignee: Akira AJISAKA
Priority: Critical
  Labels: metrics
 Attachments: HDFS-6945-003.patch, HDFS-6945-004.patch, 
 HDFS-6945-005.patch, HDFS-6945.2.patch, HDFS-6945.patch


 I'm seeing the ExcessBlocks metric increase to more than 300K in some 
 clusters; however, there are no over-replicated blocks (confirmed by fsck).
 After further research, I noticed that when deleting a block, BlockManager 
 does not remove the block from excessReplicateMap or decrement 
 excessBlocksCount.
 Usually the metric is decremented when processing a block report; however, if 
 the block has been deleted, BlockManager does not remove the block from 
 excessReplicateMap or decrement the metric.
 That way the metric and excessReplicateMap can grow indefinitely (i.e., a 
 memory leak can occur).
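
 A hedged sketch of the cleanup pattern the description implies (not the 
 attached patch; it assumes excessReplicateMap maps each datanode to a 
 LightWeightLinkedSet of its excess blocks, as in BlockManager):

{code}
// Called from the block-removal path so deleted blocks cannot linger in
// the excess-replica bookkeeping.
private void removeFromExcessReplicateMap(Block block) {
  for (LightWeightLinkedSet<Block> excessBlocks :
      excessReplicateMap.values()) {
    if (excessBlocks.remove(block)) {
      // keep the ExcessBlocks metric in step with the map contents
      excessBlocksCount.decrementAndGet();
    }
  }
}
{code}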



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7786) Handle slow writers for DFSStripedOutputStream

2015-03-31 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7786:

Summary: Handle slow writers for DFSStripedOutputStream  (was: Handle slow 
writers for DFSOutputStream when there're multiple data streamers)

 Handle slow writers for DFSStripedOutputStream
 --

 Key: HDFS-7786
 URL: https://issues.apache.org/jira/browse/HDFS-7786
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Fix For: HDFS-7285


 There are multiple data streamers in DFSOutputStream when it is used to write 
 a file with a striped layout. These streamers may have different write 
 speeds, and some may write data very slowly. Some streamers may fail and 
 exit. We need to consider these situations and handle them reliably.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7786) Handle slow writers for DFSStripedOutputStream

2015-03-31 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7786:

Description: The streamers in DFSStripedOutputStream may have different 
write speeds. We need to consider and handle the situation in which one or 
more writers begin to write slowly.  (was: There are multiple data streamers 
in DFSOutputStream when it is used to write a file with a striped layout. 
These streamers may have different write speeds, and some may write data very 
slowly. Some streamers may fail and exit. We need to consider these situations 
and handle them reliably.)

 Handle slow writers for DFSStripedOutputStream
 --

 Key: HDFS-7786
 URL: https://issues.apache.org/jira/browse/HDFS-7786
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Fix For: HDFS-7285


 The streamers in DFSStripedOutputStream may have different write speeds. We 
 need to consider and handle the situation in which one or more writers begin 
 to write slowly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7786) Handle slow writers for DFSStripedOutputStream

2015-03-31 Thread Li Bo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Bo updated HDFS-7786:

Description: The streamers in DFSStripedOutputStream may have different 
write speeds. We need to consider and handle the situation in which one or 
more streamers begin to write slowly.  (was: The streamers in 
DFSStripedOutputStream may have different write speeds. We need to consider 
and handle the situation in which one or more writers begin to write slowly.)

 Handle slow writers for DFSStripedOutputStream
 --

 Key: HDFS-7786
 URL: https://issues.apache.org/jira/browse/HDFS-7786
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Li Bo
Assignee: Li Bo
 Fix For: HDFS-7285


 The streamers in DFSStripedOutputStream may have different write speeds. We 
 need to consider and handle the situation in which one or more streamers 
 begin to write slowly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HDFS-7991) Allow users to skip checkpoint when stopping NameNode

2015-03-31 Thread Allen Wittenauer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388796#comment-14388796
 ] 

Allen Wittenauer edited comment on HDFS-7991 at 3/31/15 4:31 PM:
-

bq. (since the stop command only waits 5s)

This is easily fixed by just increasing the timeout or adding other logic, 
such as asking if the NN is still alive, etc.

But in any case, it occurred to me this morning that the current code just flat 
out won't work in practice.  The problem is that HADOOP_OPTS has the NN's 
configuration inside it.  So, for example, if a user sets the heap size to 64g, 
then dfsadmin is going to run with a 64g heap as well. Same thing with gc logs 
and any other custom JVM setting.

The code absolutely must shell out another bin/hdfs process to get the proper 
HADOOP_OPTS setting.  I suspect it will actually have to use a subshell plus 
parameter captures so that the environment is clean, due to various {{export}} 
statements throughout the code and in a lot of users' *-env.sh files.


was (Author: aw):
bq. (since the stop command only waits 5s)

This is easily fixed by just increasing the timeout or adding other logic, 
such as asking if the NN is still alive, etc.

But in any case, it occurred to me this morning that the current code just flat 
out won't work in practice.  The problem is that HADOOP_OPTS has the NN's 
configuration inside it.  So, for example, if a user sets the heap size to 64g, 
then dfsadmin is going to run with a 64g heap as well. Same thing with gc logs 
and any other custom JVM setting.

The code absolutely must shell out another bin/hdfs process to get the proper 
HADOOP_OPTS setting.  I suspect it will actually have to use a subshell plus 
captures parameters so that the environment is clean, due to various {{export}} 
statements throughout the code and in a lot of users' *-env.sh files.

 Allow users to skip checkpoint when stopping NameNode
 -

 Key: HDFS-7991
 URL: https://issues.apache.org/jira/browse/HDFS-7991
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.0.0
Reporter: Jing Zhao
Assignee: Jing Zhao
 Attachments: HDFS-7991.000.patch, HDFS-7991.001.patch, 
 HDFS-7991.002.patch, HDFS-7991.003.patch


 This is a follow-up jira of HDFS-6353. HDFS-6353 adds the functionality to 
 check if saving namespace is necessary before stopping namenode. As [~kihwal] 
 pointed out in this 
 [comment|https://issues.apache.org/jira/browse/HDFS-6353?focusedCommentId=14380898page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14380898],
  in a secured cluster this new functionality requires the user to be kinit'ed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

