[jira] [Updated] (HDFS-11573) Support rename between different NameNodes in federated HDFS

2018-07-27 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11573:

Flags: Patch

> Support rename between different NameNodes in federated HDFS
> 
>
> Key: HDFS-11573
> URL: https://issues.apache.org/jira/browse/HDFS-11573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-11573-2.6.5-001.patch, HDFS_federation_rename.pdf
>
>
>A federated file system can improve overall scalability by dividing a 
> single namespace into multiple sub-namespaces. Since the divided 
> sub-namespaces are held by different namenodes, rename operations between those 
> sub-namespaces are forbidden. Due to this restriction, many applications have 
> to be rewritten to work around the issue after migrating to a federated file 
> system. Supporting rename between different namenodes would make it much 
> easier to migrate to federated file systems. 
>We have finished a preliminary implementation of this feature on a 2.6 
> branch. I'll upload a write-up regarding the design in a few days, and then 
> I'll re-org the code against trunk and upload the patch.
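For illustration, a minimal sketch of the copy-and-delete fallback that applications
typically resort to today when the source and destination live in different
sub-namespaces; the namenode URIs and paths below are hypothetical and this is not
part of the attached patch:

{code}
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class CrossNamespaceMove {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Two sub-namespaces of a federated cluster, served by different namenodes
    // (hypothetical addresses).
    FileSystem srcFs = FileSystem.get(URI.create("hdfs://nn1:8020"), conf);
    FileSystem dstFs = FileSystem.get(URI.create("hdfs://nn2:8020"), conf);

    Path src = new Path("/ns1/user/data/part-00000");
    Path dst = new Path("/ns2/user/data/part-00000");

    // A rename across namenodes is rejected, so applications fall back to
    // copying the data and then deleting the source (deleteSource = true).
    FileUtil.copy(srcFs, src, dstFs, dst, true, conf);
  }
}
{code}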






[jira] [Commented] (HDFS-10240) Race between close/recoverLease leads to missing block

2018-07-23 Thread zhouyingchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552368#comment-16552368
 ] 

zhouyingchao commented on HDFS-10240:
-

Hi Wei-Chiu, [~LiJinglun], who figured out the issue together with me, has some free 
time these days. Could you please re-assign the issue to him? It looks 
like I cannot re-assign it myself ...

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
>Priority: Major
> Attachments: HDFS-10240 scenarios.jpg, HDFS-10240-001.patch, 
> HDFS-10240.test.patch
>
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
>  newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
> 10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.44:11402 by /10.114.5.44 because 

[jira] [Commented] (HDFS-10240) Race between close/recoverLease leads to missing block

2018-07-22 Thread zhouyingchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552301#comment-16552301
 ] 

zhouyingchao commented on HDFS-10240:
-

Thank you, Wei-Chiu. I'd like to work on this issue. I'll add more tests in a 
few days.

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
>Priority: Major
> Attachments: HDFS-10240 scenarios.jpg, HDFS-10240-001.patch, 
> HDFS-10240.test.patch
>
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
>  newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
> 10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.44:11402 by /10.114.5.44 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> From 

[jira] [Commented] (HDFS-13700) The process of loading image can be done in a pipeline model

2018-06-26 Thread zhouyingchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523522#comment-16523522
 ] 

zhouyingchao commented on HDFS-13700:
-

Tested the patch against an fsimage of a 70PB 2.4 cluster (200 million files and 
300 million blocks; the fsimage is around 22GB). The image loading time was 
reduced from 1210 seconds to 739 seconds.

> The process of loading image can be done in a pipeline model
> 
>
> Key: HDFS-13700
> URL: https://issues.apache.org/jira/browse/HDFS-13700
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13700-001.patch
>
>
> The process of loading a file system image involves reading the inodes section, 
> deserializing inodes, initializing inodes, adding inodes to the global map, 
> reading the directories section, adding inodes to their parents' maps, caching 
> names, etc. These steps can be done in a pipeline model to reduce the total 
> duration. 
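For illustration only, a minimal producer/consumer sketch of the pipeline idea (this
is not the attached patch; the stage boundaries, queue size, and types are
simplified): while one thread reads and deserializes inodes, a second thread
initializes them and populates the global map, so the stages overlap instead of
running strictly one after another.

{code}
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

public class PipelinedImageLoadSketch {
  private static final Object EOF = new Object();

  public static void main(String[] args) throws Exception {
    BlockingQueue<Object> handoff = new ArrayBlockingQueue<>(4096);
    Map<Long, Object> inodeMap = new ConcurrentHashMap<>();  // stands in for the global inode map

    // Stage 1: read and deserialize inodes (simulated here by producing ids).
    Thread reader = new Thread(() -> {
      try {
        for (long id = 0; id < 1_000_000; id++) {
          handoff.put(new long[] { id });   // a "deserialized inode"
        }
        handoff.put(EOF);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    // Stage 2: initialize inodes and add them to the global map while stage 1 keeps reading.
    Thread builder = new Thread(() -> {
      try {
        for (Object o; (o = handoff.take()) != EOF; ) {
          inodeMap.put(((long[]) o)[0], o);
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });

    reader.start();
    builder.start();
    reader.join();
    builder.join();
    System.out.println("loaded " + inodeMap.size() + " inodes");
  }
}
{code}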






[jira] [Updated] (HDFS-13700) The process of loading image can be done in a pipeline model

2018-06-26 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-13700:

Attachment: HDFS-13700-001.patch
Status: Patch Available  (was: Open)

> The process of loading image can be done in a pipeline model
> 
>
> Key: HDFS-13700
> URL: https://issues.apache.org/jira/browse/HDFS-13700
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13700-001.patch
>
>
> The process of loading a file system image involves reading the inodes section, 
> deserializing inodes, initializing inodes, adding inodes to the global map, 
> reading the directories section, adding inodes to their parents' maps, caching 
> names, etc. These steps can be done in a pipeline model to reduce the total 
> duration. 






[jira] [Commented] (HDFS-13700) The process of loading image can be done in a pipeline model

2018-06-26 Thread zhouyingchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523520#comment-16523520
 ] 

zhouyingchao commented on HDFS-13700:
-

[~brahmareddy], thank you for telling me about HDFS-7784. I guess it's essentially 
the same optimization as the one I had in mind. Since I have already finished a 
patch that implements a pipeline model for this, I'd like to post the patch 
here for reference. 

> The process of loading image can be done in a pipeline model
> 
>
> Key: HDFS-13700
> URL: https://issues.apache.org/jira/browse/HDFS-13700
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
>
> The process of loading a file system image involves reading the inodes section, 
> deserializing inodes, initializing inodes, adding inodes to the global map, 
> reading the directories section, adding inodes to their parents' maps, caching 
> names, etc. These steps can be done in a pipeline model to reduce the total 
> duration. 






[jira] [Created] (HDFS-13700) The process of loading image can be done in a pipeline model

2018-06-26 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-13700:
---

 Summary: The process of loading image can be done in a pipeline 
model
 Key: HDFS-13700
 URL: https://issues.apache.org/jira/browse/HDFS-13700
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: zhouyingchao


The process of loading a file system image involves reading the inodes section, 
deserializing inodes, initializing inodes, adding inodes to the global map, reading 
the directories section, adding inodes to their parents' maps, caching names, etc. 
These steps can be done in a pipeline model to reduce the total duration. 






[jira] [Updated] (HDFS-13694) Making md5 computing being in parallel with image loading

2018-06-22 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-13694:

Attachment: HDFS-13694-001.patch
Status: Patch Available  (was: Open)

Tested the patch against an fsimage of a 70PB 2.4 cluster (200 million files and 
300 million blocks). The image loading time was reduced from 1210 seconds to 1105 
seconds.

> Making md5 computing being in parallel with image loading
> -
>
> Key: HDFS-13694
> URL: https://issues.apache.org/jira/browse/HDFS-13694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13694-001.patch
>
>
> During namenode image loading, the MD5 of the image file is computed first and 
> then the image is loaded. These two steps can actually be done in parallel.
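One way to overlap the two steps, shown as a sketch only (not necessarily how the
attached patch does it): wrap the image stream in a DigestInputStream so the MD5
accumulates while the loader consumes the bytes, instead of making a separate full
pass over the file first. The parse step is left as a hypothetical comment.

{code}
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.math.BigInteger;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public class Md5WhileLoadingSketch {
  public static void main(String[] args) throws Exception {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    try (InputStream in = new DigestInputStream(
        new BufferedInputStream(new FileInputStream(args[0])), md5)) {
      byte[] buf = new byte[64 * 1024];
      while (in.read(buf) != -1) {
        // Hypothetical parse step: the image loader would consume the bytes in buf here,
        // while the wrapped stream keeps feeding the MD5 digest as a side effect.
      }
    }
    System.out.printf("md5 = %032x%n", new BigInteger(1, md5.digest()));
  }
}
{code}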






[jira] [Updated] (HDFS-13694) Making md5 computing being in parallel with image loading

2018-06-22 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-13694:

Attachment: (was: HDFS-13694-001.patch)

> Making md5 computing being in parallel with image loading
> -
>
> Key: HDFS-13694
> URL: https://issues.apache.org/jira/browse/HDFS-13694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
>
> During namenode image loading, the MD5 of the image file is computed first and 
> then the image is loaded. These two steps can actually be done in parallel.






[jira] [Updated] (HDFS-13694) Making md5 computing being in parallel with image loading

2018-06-22 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-13694:

Attachment: HDFS-13694-001.patch

> Making md5 computing being in parallel with image loading
> -
>
> Key: HDFS-13694
> URL: https://issues.apache.org/jira/browse/HDFS-13694
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13694-001.patch
>
>
> During namenode image loading, the MD5 of the image file is computed first and 
> then the image is loaded. These two steps can actually be done in parallel.






[jira] [Created] (HDFS-13694) Making md5 computing being in parallel with image loading

2018-06-22 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-13694:
---

 Summary: Making md5 computing being in parallel with image loading
 Key: HDFS-13694
 URL: https://issues.apache.org/jira/browse/HDFS-13694
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: zhouyingchao


During namenode image loading, the MD5 of the image file is computed first and then 
the image is loaded. These two steps can actually be done in parallel.






[jira] [Commented] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading

2018-06-21 Thread zhouyingchao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16519897#comment-16519897
 ] 

zhouyingchao commented on HDFS-13693:
-

Tested the patch against an fsimage of a 70PB 2.4 cluster (200 million files and 
300 million blocks). The image loading time was reduced from 1210 seconds to 1138 
seconds.

> Remove unnecessary search in INodeDirectory.addChild during image loading
> -
>
> Key: HDFS-13693
> URL: https://issues.apache.org/jira/browse/HDFS-13693
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13693-001.patch
>
>
> In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added 
> to their parent INode's map one by one. The adding procedure searches for a 
> position in the parent's map and then inserts the child at that position. 
> However, during image loading the search is unnecessary, since the insert 
> position should always be at the end of the map given the order in which the 
> children are serialized on disk.
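As a toy illustration (hypothetical class, not the actual INodeDirectory code):
because children in the image arrive in the same sorted order in which they were
serialized, the insertion position during loading is always the tail of the list,
so the per-child binary search can be skipped.

{code}
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedChildrenSketch {
  private final List<String> children = new ArrayList<>();

  // General-purpose path: binary-search for the position, then insert there.
  public void addChild(String name) {
    int i = Collections.binarySearch(children, name);
    if (i < 0) {
      children.add(-i - 1, name);
    }
  }

  // Image-loading path: children arrive in the serialized (sorted) order, so just append.
  public void addChildAtLoading(String name) {
    assert children.isEmpty()
        || children.get(children.size() - 1).compareTo(name) < 0;
    children.add(name);
  }
}
{code}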






[jira] [Updated] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading

2018-06-21 Thread zhouyingchao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-13693:

Attachment: HDFS-13693-001.patch
Status: Patch Available  (was: Open)

Ran all HDFS-related unit tests; the patch does not introduce new failures.

> Remove unnecessary search in INodeDirectory.addChild during image loading
> -
>
> Key: HDFS-13693
> URL: https://issues.apache.org/jira/browse/HDFS-13693
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: zhouyingchao
>Priority: Major
> Attachments: HDFS-13693-001.patch
>
>
> In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added 
> to their parent INode's map one by one. The adding procedure searches for a 
> position in the parent's map and then inserts the child at that position. 
> However, during image loading the search is unnecessary, since the insert 
> position should always be at the end of the map given the order in which the 
> children are serialized on disk.






[jira] [Created] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading

2018-06-21 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-13693:
---

 Summary: Remove unnecessary search in INodeDirectory.addChild 
during image loading
 Key: HDFS-13693
 URL: https://issues.apache.org/jira/browse/HDFS-13693
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: zhouyingchao


In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added 
to their parent INode's map one by one. The adding procedure searches for a 
position in the parent's map and then inserts the child at that position. However, 
during image loading the search is unnecessary, since the insert position should 
always be at the end of the map given the order in which the children are 
serialized on disk.






[jira] [Updated] (HDFS-11573) Support rename between different NameNodes in federated HDFS

2017-06-06 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11573:

Attachment: HDFS-11573-2.6.5-001.patch

Since we implemented the feature on a 2.6 branch, it is easy to merge the code 
with the 2.6 branch, so I am uploading the patch against the 2.6.5 branch first. If 
necessary, I can merge the code with the 3.0 branch in the future.

> Support rename between different NameNodes in federated HDFS
> 
>
> Key: HDFS-11573
> URL: https://issues.apache.org/jira/browse/HDFS-11573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: zhouyingchao
> Attachments: HDFS-11573-2.6.5-001.patch, HDFS_federation_rename.pdf
>
>
>A federated file system can improve overall scalability by dividing a 
> single namespace into multiple sub-namespaces. Since the divided 
> sub-namespaces are held by different namenodes, rename operations between those 
> sub-namespaces are forbidden. Due to this restriction, many applications have 
> to be rewritten to work around the issue after migrating to a federated file 
> system. Supporting rename between different namenodes would make it much 
> easier to migrate to federated file systems. 
>We have finished a preliminary implementation of this feature on a 2.6 
> branch. I'll upload a write-up regarding the design in a few days, and then 
> I'll re-org the code against trunk and upload the patch.






[jira] [Updated] (HDFS-11573) Support rename between different NameNodes in federated HDFS

2017-06-06 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11573:

Attachment: HDFS_federation_rename.pdf

Uploaded a write-up regarding the feature.

> Support rename between different NameNodes in federated HDFS
> 
>
> Key: HDFS-11573
> URL: https://issues.apache.org/jira/browse/HDFS-11573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: zhouyingchao
> Attachments: HDFS_federation_rename.pdf
>
>
>A federated file system can improve overall scalability by dividing a 
> single namespace into multiple sub-namespaces. Since the divided 
> sub-namespaces are held by different namenodes, rename operations between those 
> sub-namespaces are forbidden. Due to this restriction, many applications have 
> to be rewritten to work around the issue after migrating to a federated file 
> system. Supporting rename between different namenodes would make it much 
> easier to migrate to federated file systems. 
>We have finished a preliminary implementation of this feature on a 2.6 
> branch. I'll upload a write-up regarding the design in a few days, and then 
> I'll re-org the code against trunk and upload the patch.






[jira] [Updated] (HDFS-11895) Committed block should be completed during block report if usable replicas are enough

2017-05-26 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11895:

Status: Patch Available  (was: Open)

> Committed block should be completed during block report if usable replicas 
> are enough
> -
>
> Key: HDFS-11895
> URL: https://issues.apache.org/jira/browse/HDFS-11895
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Priority: Minor
> Attachments: HDFS-11895-001.patch
>
>
> In a 2.4 HDFS cluster, we found an issue where a completeFile call failed 
> because all three replicas of the file's last block were in the decommissioning 
> state, and the decommissioning eventually got stuck. We figured out the issue had 
> been fixed by HDFS-11499, which completes a committed block in the close/recovery 
> code path if there are enough usable replicas. Besides that, we think we'd better 
> also complete a committed block in the block report path if there are enough 
> usable replicas. It helps the case where a client calls completeFile and then 
> exits abnormally (without retry) while all of the last block's replica DNs are in 
> the decommissioning state.
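A self-contained toy model of the proposed rule (this is not NameNode code; the
states and counting are simplified): a COMMITTED block becomes COMPLETE as soon as a
block report shows enough usable replicas, mirroring what HDFS-11499 already does on
the close/recovery path.

{code}
import java.util.HashSet;
import java.util.Set;

public class CommittedBlockModel {
  enum State { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

  private State state = State.COMMITTED;          // client already called completeFile
  private final Set<String> usableReplicas = new HashSet<>();
  private final int minReplication = 1;

  // Called for each replica of the block reported by a datanode's block report.
  void onReportedReplica(String datanode, boolean decommissioning) {
    if (!decommissioning) {
      usableReplicas.add(datanode);               // only non-decommissioning replicas count
    }
    if (state == State.COMMITTED && usableReplicas.size() >= minReplication) {
      state = State.COMPLETE;                     // proposed: complete here too, not only on close/recovery
    }
  }

  public static void main(String[] args) {
    CommittedBlockModel block = new CommittedBlockModel();
    block.onReportedReplica("dn1", true);         // decommissioning replica: block stays COMMITTED
    block.onReportedReplica("dn2", false);        // usable replica: block becomes COMPLETE
    System.out.println(block.state);
  }
}
{code}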






[jira] [Updated] (HDFS-11895) Committed block should be completed during block report if usable replicas are enough

2017-05-26 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11895:

Attachment: HDFS-11895-001.patch

A patch for the issue.

> Committed block should be completed during block report if usable replicas 
> are enough
> -
>
> Key: HDFS-11895
> URL: https://issues.apache.org/jira/browse/HDFS-11895
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Priority: Minor
> Attachments: HDFS-11895-001.patch
>
>
> In a 2.4 HDFS cluster, we found an issue where a completeFile call failed 
> because all three replicas of the file's last block were in the decommissioning 
> state, and the decommissioning eventually got stuck. We figured out the issue had 
> been fixed by HDFS-11499, which completes a committed block in the close/recovery 
> code path if there are enough usable replicas. Besides that, we think we'd better 
> also complete a committed block in the block report path if there are enough 
> usable replicas. It helps the case where a client calls completeFile and then 
> exits abnormally (without retry) while all of the last block's replica DNs are in 
> the decommissioning state.






[jira] [Created] (HDFS-11895) Committed block should be completed during block report if usable replicas are enough

2017-05-26 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-11895:
---

 Summary: Committed block should be completed during block report 
if usable replicas are enough
 Key: HDFS-11895
 URL: https://issues.apache.org/jira/browse/HDFS-11895
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Priority: Minor


In a 2.4 HDFS cluster, we found an issue where a completeFile call failed because 
all three replicas of the file's last block were in the decommissioning state, and 
the decommissioning eventually got stuck. We figured out the issue had been fixed by 
HDFS-11499, which completes a committed block in the close/recovery code path if 
there are enough usable replicas. Besides that, we think we'd better also complete a 
committed block in the block report path if there are enough usable replicas. It 
helps the case where a client calls completeFile and then exits abnormally (without 
retry) while all of the last block's replica DNs are in the decommissioning state.






[jira] [Updated] (HDFS-11573) Support rename between different NameNodes in federated HDFS

2017-03-24 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-11573:

Description: 
   A federated file system can improve overall scalability by dividing a 
single namespace into multiple sub-namespaces. Since the divided sub-namespaces 
are held by different namenodes, rename operations between those sub-namespaces 
are forbidden. Due to this restriction, many applications have to be rewritten 
to work around the issue after migrating to a federated file system. Supporting 
rename between different namenodes would make it much easier to migrate to 
federated file systems. 

   We have finished a preliminary implementation of this feature on a 2.6 
branch. I'll upload a write-up regarding the design in a few days, and then 
I'll re-org the code against trunk and upload the patch.

  was:
   Federated file system can improve overall scalability by dividing a 
single namespace into multiple sub-namespaces. Since the divided sub-namespace 
is held by different namenodes, rename operation between those sub-namespace is 
forbidden. Due to this restriction, many applications have to be rewritten to 
work around the issue after migrated to federated file system. Supporting 
rename between different namenodes would make it much easier to migrate to 
federated file systems. 

   We have finished a preliminary implementation of this feature in a 2.6 
branch.  I'll upload a write-up regarding the design in a few days. And then 
I'll re-org the code against the trunk and upload the patch.


> Support rename between different NameNodes in federated HDFS
> 
>
> Key: HDFS-11573
> URL: https://issues.apache.org/jira/browse/HDFS-11573
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: federation
>Reporter: zhouyingchao
>
>A federated file system can improve overall scalability by dividing a 
> single namespace into multiple sub-namespaces. Since the divided 
> sub-namespaces are held by different namenodes, rename operations between those 
> sub-namespaces are forbidden. Due to this restriction, many applications have 
> to be rewritten to work around the issue after migrating to a federated file 
> system. Supporting rename between different namenodes would make it much 
> easier to migrate to federated file systems. 
>We have finished a preliminary implementation of this feature on a 2.6 
> branch. I'll upload a write-up regarding the design in a few days, and then 
> I'll re-org the code against trunk and upload the patch.






[jira] [Created] (HDFS-11573) Support rename between different NameNodes in federated HDFS

2017-03-24 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-11573:
---

 Summary: Support rename between different NameNodes in federated 
HDFS
 Key: HDFS-11573
 URL: https://issues.apache.org/jira/browse/HDFS-11573
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: federation
Reporter: zhouyingchao


   A federated file system can improve overall scalability by dividing a 
single namespace into multiple sub-namespaces. Since the divided sub-namespaces 
are held by different namenodes, rename operations between those sub-namespaces are 
forbidden. Due to this restriction, many applications have to be rewritten to 
work around the issue after migrating to a federated file system. Supporting 
rename between different namenodes would make it much easier to migrate to 
federated file systems. 

   We have finished a preliminary implementation of this feature on a 2.6 
branch. I'll upload a write-up regarding the design in a few days, and then 
I'll re-org the code against trunk and upload the patch.






[jira] [Created] (HDFS-10989) Cannot get last block length after namenode failover

2016-10-09 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-10989:
---

 Summary: Cannot get last block length after namenode failover
 Key: HDFS-10989
 URL: https://issues.apache.org/jira/browse/HDFS-10989
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao


On a 2.4 cluster, access to a file failed because the last block length could not be 
obtained. The fsck output for the file at the moment of failure was like this:
/user/X 483600487 bytes, 2 block(s), OPENFORWRITE:  MISSING 1 blocks of 
total size 215165031 B
0. BP-219149063-10.108.84.25-1446859315800:blk_2102504098_1035525341 
len=268435456 repl=3 [10.112.17.43:11402, 10.118.22.46:11402, 
10.118.22.49:11402]
1. 
BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036219054{blockUCState=UNDER_RECOVERY,
 primaryNodeIndex=2, 
replicas=[ReplicaUnderConstruction[[DISK]DS-60be75ad-e4a7-4b1e-b3aa-327c85331d42:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-184a1ce9-655a-4e67-b0cc-29ab9984bd0a:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-6d037ac8-4bcc-4cdc-a803-55b1817e0200:NORMAL|RBW]]}
 len=215165031 MISSING!  Recorded locations [10.114.10.14:11402, 
10.118.29.3:11402, 10.118.22.42:11402]

From those three datanodes, we found IOExceptions related to the block and 
pipeline re-creation events.

We figured out that there was a namenode failover event before the issue 
happened, and there were some updatePipeline calls to the earlier active 
namenode:
2016-09-27,15:04:36,437 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036137092,
 newGenerationStamp=1036170430, newLength=2624000, 
newNodes=[10.118.22.42:11402, 10.118.22.49:11402, 10.118.24.3:11402], 
clientName=DFSClient_NONMAPREDUCE_-442153643_1)
2016-09-27,15:04:36,438 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
updatePipeline(BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036137092)
 successfully to 
BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430
2016-09-27,15:10:10,596 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
updatePipeline(block=BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430,
 newGenerationStamp=1036219054, newLength=17138265, 
newNodes=[10.118.22.49:11402, 10.118.24.3:11402, 10.114.6.45:11402], 
clientName=DFSClient_NONMAPREDUCE_-442153643_1)
2016-09-27,15:10:10,601 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
updatePipeline(BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036170430)
 successfully to 
BP-219149063-10.108.84.25-1446859315800:blk_2103114087_1036219054

However, these new datanodes did not show up in the fsck output. It looks like 
when a datanode recovers the pipeline (PIPELINE_SETUP_STREAMING_RECOVERY), the 
new datanodes do not call notifyNamenodeReceivingBlock for the transferred 
block. 

From code review, the issue also exists on more recent branches.






[jira] [Updated] (HDFS-10240) Race between close/recoverLease leads to missing block

2016-03-31 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10240:

Attachment: HDFS-10240-001.patch

If the patch looks OK, I will add some unit tests.

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Attachments: HDFS-10240-001.patch
>
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
>  newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
> 10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.44:11402 by /10.114.5.44 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> From the log, I guess this is what happened, in order:
> 1  Process A opened a file F for write.
> 2  Somebody else called 

[jira] [Updated] (HDFS-10240) Race between close/recoverLease leads to missing block

2016-03-31 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10240:

Status: Patch Available  (was: Open)

> Race between close/recoverLease leads to missing block
> --
>
> Key: HDFS-10240
> URL: https://issues.apache.org/jira/browse/HDFS-10240
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
>
> We got a missing block in our cluster, and logs related to the missing block 
> are as follows:
> 2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
> blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
>  recovery started, 
> primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
> 2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.internalReleaseLease: File XX has not been closed. Lease 
> recovery is in progress. RecoveryId = 153006357 for block 
> blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
> 2016-03-28,10:00:06,248 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> checkFileProgress: blk_1226490256_153006345{blockUCState=COMMITTED, 
> primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  has not reached minimal replication 1
> 2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.53:11402 is added to 
> blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
> replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
>  
> ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
>  size 139
> 2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: 
> blockMap updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 
> 139
> 2016-03-28,10:00:08,808 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
>  newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
> 10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> 2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
> NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
> 10.114.5.44:11402 by /10.114.5.44 because block is COMPLETE and reported 
> genstamp 153006357 does not match genstamp in block map 153006345
> From the log, I guess this is what happened, in order:
> 1  Process A opened a file F for write.
> 2  Somebody else called recoverLease against F.
> 3  A closed F.
> The root cause of the missing block is that 

[jira] [Created] (HDFS-10240) Race between close/recoverLease leads to missing block

2016-03-31 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-10240:
---

 Summary: Race between close/recoverLease leads to missing block
 Key: HDFS-10240
 URL: https://issues.apache.org/jira/browse/HDFS-10240
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao


We got a missing block in our cluster, and logs related to the missing block 
are as follows:
2016-03-28,10:00:06,188 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
allocateBlock: XX. BP-219149063-10.108.84.25-1446859315800 
blk_1226490256_153006345{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, 
replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
2016-03-28,10:00:06,205 INFO BlockStateChange: BLOCK* 
blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
 recovery started, 
primary=ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]
2016-03-28,10:00:06,205 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
NameSystem.internalReleaseLease: File XX has not been closed. Lease 
recovery is in progress. RecoveryId = 153006357 for block 
blk_1226490256_153006345{blockUCState=UNDER_RECOVERY, primaryNodeIndex=2, 
replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-3f5032bc-6006-4fcc-b0f7-b355a5b94f1b:NORMAL|RBW]]}
2016-03-28,10:00:06,248 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* checkFileProgress: 
blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
 has not reached minimal replication 1
2016-03-28,10:00:06,358 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.114.5.53:11402 is added to 
blk_1226490256_153006345{blockUCState=COMMITTED, primaryNodeIndex=2, 
replicas=[ReplicaUnderConstruction[[DISK]DS-bcd22774-cf4d-45e9-a6a6-c475181271c9:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-ec1413ae-5541-4b44-8922-c928be3bb306:NORMAL|RBW],
 
ReplicaUnderConstruction[[DISK]DS-85819f0d-bdbb-4a9b-b90c-eba078547c23:NORMAL|RBW]]}
 size 139
2016-03-28,10:00:06,441 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.114.5.44:11402 is added to blk_1226490256_153006345 size 139
2016-03-28,10:00:06,660 INFO BlockStateChange: BLOCK* addStoredBlock: blockMap 
updated: 10.114.6.14:11402 is added to blk_1226490256_153006345 size 139
2016-03-28,10:00:08,808 INFO 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem: 
commitBlockSynchronization(lastblock=BP-219149063-10.108.84.25-1446859315800:blk_1226490256_153006345,
 newgenerationstamp=153006357, newlength=139, newtargets=[10.114.6.14:11402, 
10.114.5.53:11402, 10.114.5.44:11402], closeFile=true, deleteBlock=false)
2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
10.114.6.14:11402 by /10.114.6.14 because block is COMPLETE and reported 
genstamp 153006357 does not match genstamp in block map 153006345
2016-03-28,10:00:08,836 INFO BlockStateChange: BLOCK 
NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
10.114.5.53:11402 by /10.114.5.53 because block is COMPLETE and reported 
genstamp 153006357 does not match genstamp in block map 153006345
2016-03-28,10:00:08,837 INFO BlockStateChange: BLOCK 
NameSystem.addToCorruptReplicasMap: blk_1226490256 added as corrupt on 
10.114.5.44:11402 by /10.114.5.44 because block is COMPLETE and reported 
genstamp 153006357 does not match genstamp in block map 153006345

From the log, I guess this is what happened, in order:
1  Process A opened a file F for write.
2  Somebody else called recoverLease against F.
3  A closed F.

The root cause of the missing block is that recoverLease increased the generation 
stamp of the block on the datanodes, whereas the generation stamp on the Namenode 
was reset by the close in step 3. I think we should check whether the last block is 
under recovery when trying to close a file; if so, we should just throw an exception 
back to the client quickly.

Although the issue was found on a 2.4 HDFS cluster, from code review it looks like 
the issue also exists on trunk. 
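A toy model of the proposed guard (made-up names, not FSNamesystem code): fail the
close quickly if the last block is still under recovery, instead of letting close
reset state that the recovery has already advanced.

{code}
import java.io.IOException;

public class CompleteFileGuardSketch {
  enum BlockState { UNDER_CONSTRUCTION, UNDER_RECOVERY, COMMITTED, COMPLETE }

  static void completeFile(BlockState lastBlockState) throws IOException {
    if (lastBlockState == BlockState.UNDER_RECOVERY) {
      // Reject the close so it cannot race with the ongoing lease recovery.
      throw new IOException("Last block is under recovery; cannot close the file now");
    }
    // ... otherwise proceed with the normal commit/complete logic ...
  }

  public static void main(String[] args) {
    try {
      completeFile(BlockState.UNDER_RECOVERY);
    } catch (IOException e) {
      System.out.println("close rejected: " + e.getMessage());
    }
  }
}
{code}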

[jira] [Commented] (HDFS-8496) Calling stopWriter() with FSDatasetImpl lock held may block other threads

2016-03-31 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15219483#comment-15219483
 ] 

zhouyingchao commented on HDFS-8496:


Thank you, Colin!  The patch looks good to me.

> Calling stopWriter() with FSDatasetImpl lock held may block other threads
> -
>
> Key: HDFS-8496
> URL: https://issues.apache.org/jira/browse/HDFS-8496
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.6.0
>Reporter: zhouyingchao
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-8496-001.patch, HDFS-8496.002.patch
>
>
> On a DN of an HDFS 2.6 cluster, we noticed some DataXceiver threads and 
> heartbeat threads were blocked for quite a while on the FSDatasetImpl lock. By 
> looking at the stacks, we found that calling stopWriter() with the FSDatasetImpl 
> lock held blocked everything.
> The following heartbeat stack, as an example, shows how threads are 
> blocked by the FSDatasetImpl lock:
> {code}
>java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
> - waiting to lock <0x0007701badc0> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getAvailable(FsVolumeImpl.java:191)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
> - locked <0x000770465dc0> (a java.lang.Object)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
> at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> The thread holding the FSDatasetImpl lock is just sleeping in stopWriter(), 
> waiting for another thread to exit. Its stack is:
> {code}
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Thread.join(Thread.java:1194)
> - locked <0x0007636953b8> (a org.apache.hadoop.util.Daemon)
> at 
> org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverCheck(FsDatasetImpl.java:982)
> at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverClose(FsDatasetImpl.java:1026)
> - locked <0x0007701badc0> (a 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:624)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
> at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
> at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
> at java.lang.Thread.run(Thread.java:662)
> {code}
> In this case, we deployed quite a lot other workloads on the DN, the local 
> file system and disk is quite busy. We guess this is why the stopWriter took 
> quite a long time.
> Any way, it is not quite reasonable to call stopWriter with the FSDatasetImpl 
> lock held.   In HDFS-7999, the createTemporary() is changed to call 
> stopWriter without FSDatasetImpl lock. We guess we should do so in the other 
> three methods: recoverClose()/recoverAppend/recoverRbw().
> I'll try to finish a patch for this today. 
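
A minimal sketch of the locking pattern suggested above, in the spirit of what 
HDFS-7999 did for createTemporary(); volumeMap, writerStopTimeoutMs and the 
method itself are assumptions modeled on FsDatasetImpl, not the committed patch:

{code}
// Hedged sketch, not the committed patch: join the old writer *outside* the
// FsDatasetImpl monitor, then re-acquire the lock and re-validate, looping if
// the replica changed while we were unlocked.
ReplicaInfo stopWriterOutsideLockSketch(ExtendedBlock b) throws IOException {
  while (true) {
    ReplicaInPipeline writerToStop;
    synchronized (this) {                  // "this" stands in for FsDatasetImpl
      ReplicaInfo replica = volumeMap.get(b.getBlockPoolId(), b.getLocalBlock());
      if (!(replica instanceof ReplicaInPipeline)) {
        return replica;                    // no active writer to stop
      }
      writerToStop = (ReplicaInPipeline) replica;
    }
    // Potentially slow join on the writer thread, done with the lock released
    // so heartbeats and other DataXceivers are not queued behind it.
    writerToStop.stopWriter(writerStopTimeoutMs);
    synchronized (this) {
      // Re-validate: the replica map may have changed while we were unlocked.
      if (volumeMap.get(b.getBlockPoolId(), b.getLocalBlock()) == writerToStop) {
        return writerToStop;
      }
      // A different replica is there now; loop and repeat the check.
    }
  }
}
{code}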



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-28 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10182:

Attachment: HDFS-10182-branch26.patch

Patch for branch-2.6

> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Fix For: 2.7.3
>
> Attachments: HDFS-10182-001.patch, HDFS-10182-branch26.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
> passed-in buf from the caller is passed to another thread to fill.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> temp ByteBuffer. Now  suppose the second attempt wins and the first attempt 
> is blocked somewhere in the IO path. The second attempt's result would be 
> copied to the buf provided by the caller and then caller would think the 
> pread is all set. Later the caller might use the buf to do something else 
> (for e.g. read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.
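
A minimal, self-contained sketch of the proposed fix; readFromDatanode, 
hedgedReadThreadPool and hedgedReadTimeoutMs are illustrative stand-ins, not the 
actual DFSInputStream members. Every attempt, including the first, fills its own 
temporary buffer, and only the winner is copied into the caller's buf:

{code}
import java.nio.ByteBuffer;
import java.util.concurrent.*;

// Hedged-read sketch: both attempts use private temporary buffers, so a
// late-finishing loser can never scribble over the caller-supplied buf.
class HedgedReadSketch {
  private final ExecutorService hedgedReadThreadPool = Executors.newFixedThreadPool(2);
  private final long hedgedReadTimeoutMs = 500;   // assumed hedging threshold

  // Stand-in for the real "read this byte range from some DataNode" call.
  private ByteBuffer readFromDatanode(long pos, ByteBuffer tmp) {
    return tmp;
  }

  ByteBuffer hedgedPread(long pos, int len, byte[] callerBuf) throws Exception {
    CompletionService<ByteBuffer> cs =
        new ExecutorCompletionService<>(hedgedReadThreadPool);
    // First attempt fills a private temp buffer, never the caller's buf.
    cs.submit(() -> readFromDatanode(pos, ByteBuffer.allocate(len)));
    Future<ByteBuffer> winner = cs.poll(hedgedReadTimeoutMs, TimeUnit.MILLISECONDS);
    if (winner == null) {
      // Timed out: hedge with a second attempt, also using its own temp buffer.
      cs.submit(() -> readFromDatanode(pos, ByteBuffer.allocate(len)));
      winner = cs.take();              // whichever attempt finishes first
    }
    ByteBuffer result = winner.get();
    // Copy into the caller's buffer exactly once, after the winner is known.
    result.flip();
    result.get(callerBuf, 0, Math.min(len, result.remaining()));
    return result;
  }
}
{code}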



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-28 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15214159#comment-15214159
 ] 

zhouyingchao commented on HDFS-10182:
-

Thank you for picking up the fix. I'll upload a patch against 2.6 ASAP.

> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Fix For: 2.7.3
>
> Attachments: HDFS-10182-001.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
> passed-in buf from the caller is passed to another thread to fill.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> temp ByteBuffer. Now  suppose the second attempt wins and the first attempt 
> is blocked somewhere in the IO path. The second attempt's result would be 
> copied to the buf provided by the caller and then caller would think the 
> pread is all set. Later the caller might use the buf to do something else 
> (for e.g. read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-20 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10182:

Status: Patch Available  (was: Open)

> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Attachments: HDFS-10182-001.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, the passed-in buf from the 
> caller is passed to another thread to fill in the first attempt.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
> blocked somewhere in the IO path. The second attempt's result would be copied 
> to the buf provided by the caller and then caller would think the pread is 
> all set. Later the caller might use the buf to do something else (for e.g. 
> read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-19 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10182:

Description: 
In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
passed-in buf from the caller is passed to another thread to fill.  If the 
first attempt is timed out, the second attempt would be issued with another 
ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
blocked somewhere in the IO path. The second attempt's result would be copied 
to the buf provided by the caller and then caller would think the pread is all 
set. Later the caller might use the buf to do something else (for e.g. read 
another chunk of data), however, the first attempt in earlier 
hedgedFetchBlockByteRange might get some data and fill into the buf ... 
If this happens, the caller's buf would then be corrupted.

To fix the issue, we should allocate a temp buf for the first attempt too.

  was:
In DFSInputStream::hedgedFetchBlockByteRange, the passed-in buf from the caller 
is passed to another thread to fill in the first attempt.  If the first attempt 
is timed out, the second attempt would be issued with another ByteBuffer. Now  
suppose the second attempt wins and the first attempt is blocked somewhere in 
the IO path. The second attempt's result would be copied to the buf provided by 
the caller and then caller would think the pread is all set. Later the caller 
might use the buf to do something else (for e.g. read another chunk of data), 
however, the first attempt in earlier hedgedFetchBlockByteRange might get some 
data and fill into the buf ... 
If this happens, the caller's buf would then be corrupted.

To fix the issue, we should allocate a temp buf for the first attempt too.


> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Attachments: HDFS-10182-001.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
> passed-in buf from the caller is passed to another thread to fill.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
> blocked somewhere in the IO path. The second attempt's result would be copied 
> to the buf provided by the caller and then caller would think the pread is 
> all set. Later the caller might use the buf to do something else (for e.g. 
> read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-19 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10182:

Description: 
In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
passed-in buf from the caller is passed to another thread to fill.  If the 
first attempt is timed out, the second attempt would be issued with another 
temp ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
blocked somewhere in the IO path. The second attempt's result would be copied 
to the buf provided by the caller and then caller would think the pread is all 
set. Later the caller might use the buf to do something else (for e.g. read 
another chunk of data), however, the first attempt in earlier 
hedgedFetchBlockByteRange might get some data and fill into the buf ... 
If this happens, the caller's buf would then be corrupted.

To fix the issue, we should allocate a temp buf for the first attempt too.

  was:
In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
passed-in buf from the caller is passed to another thread to fill.  If the 
first attempt is timed out, the second attempt would be issued with another 
ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
blocked somewhere in the IO path. The second attempt's result would be copied 
to the buf provided by the caller and then caller would think the pread is all 
set. Later the caller might use the buf to do something else (for e.g. read 
another chunk of data), however, the first attempt in earlier 
hedgedFetchBlockByteRange might get some data and fill into the buf ... 
If this happens, the caller's buf would then be corrupted.

To fix the issue, we should allocate a temp buf for the first attempt too.


> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Attachments: HDFS-10182-001.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, during the first attempt, the 
> passed-in buf from the caller is passed to another thread to fill.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> temp ByteBuffer. Now  suppose the second attempt wins and the first attempt 
> is blocked somewhere in the IO path. The second attempt's result would be 
> copied to the buf provided by the caller and then caller would think the 
> pread is all set. Later the caller might use the buf to do something else 
> (for e.g. read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-19 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-10182:
---

 Summary: Hedged read might overwrite user's buf
 Key: HDFS-10182
 URL: https://issues.apache.org/jira/browse/HDFS-10182
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao


In DFSInputStream::hedgedFetchBlockByteRange, the buf passed in by the caller is 
handed to another thread to fill during the first attempt. If the first attempt 
times out, a second attempt is issued with a separate ByteBuffer. Now suppose the 
second attempt wins while the first attempt is blocked somewhere in the IO path. 
The second attempt's result is copied into the buf provided by the caller, and the 
caller then thinks the pread is complete. Later the caller might use the buf for 
something else (e.g. reading another chunk of data); however, the first attempt 
from the earlier hedgedFetchBlockByteRange call might still fetch some data and 
write it into the buf. If this happens, the caller's buf is corrupted.

To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-10182) Hedged read might overwrite user's buf

2016-03-18 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-10182:

Attachment: HDFS-10182-001.patch

> Hedged read might overwrite user's buf
> --
>
> Key: HDFS-10182
> URL: https://issues.apache.org/jira/browse/HDFS-10182
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: zhouyingchao
>Assignee: zhouyingchao
> Attachments: HDFS-10182-001.patch
>
>
> In DFSInputStream::hedgedFetchBlockByteRange, the passed-in buf from the 
> caller is passed to another thread to fill in the first attempt.  If the 
> first attempt is timed out, the second attempt would be issued with another 
> ByteBuffer. Now  suppose the second attempt wins and the first attempt is 
> blocked somewhere in the IO path. The second attempt's result would be copied 
> to the buf provided by the caller and then caller would think the pread is 
> all set. Later the caller might use the buf to do something else (for e.g. 
> read another chunk of data), however, the first attempt in earlier 
> hedgedFetchBlockByteRange might get some data and fill into the buf ... 
> If this happens, the caller's buf would then be corrupted.
> To fix the issue, we should allocate a temp buf for the first attempt too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8649) Default ACL is not inherited if directory is generated by FileSystem.create interface

2015-06-23 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-8649:
--

 Summary: Default ACL is not inherited if directory is generated by 
FileSystem.create interface
 Key: HDFS-8649
 URL: https://issues.apache.org/jira/browse/HDFS-8649
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao


I have a directory /acltest/t, whose ACL is as follows:
{code}
# file: /acltest/t
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
group::rwx
mask::rwx
other::---
default:user::rwx
default:group::rwx
default:mask::rwx
default:other::rwx
{code}

My program creates a file /acltest/t/a/b using the FileSystem.create interface. 
The ACL of the auto-created directory /acltest/t/a is as follows:
{code}
# file: /acltest/t/a
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
group::rwx
mask::rwx
other::---
default:user::rwx
default:group::rwx
default:mask::rwx
default:other::rwx
{code}

As you can see, the auto-created intermediate directory a did not inherit its 
parent's default ACL entry for other (it ended up with other::--- instead of 
other::rwx).

Looking into the implementation, the FileSystem.create interface automatically 
creates the non-existing directories in the path by calling 
FSNamesystem.mkdirsRecursively with the third parameter (inheritPermission) 
hard-coded to true. In FSNamesystem.mkdirsRecursively, when inheritPermission is 
true, the parent's actual permission bits (rather than permissions calculated from 
the default ACL) are used as the new directory's permission.

Is this behavior correct? The default ACL does not work as people expect, which 
causes many access issues in our setup.
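
A hypothetical reproduction sketch of the scenario above (the class name and the 
explicit mkdirs() comparison at the end are illustrative additions; the paths 
follow the example):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclStatus;

public class DefaultAclInheritRepro {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // /acltest/t already carries the default ACL shown above (default:other::rwx).
    // FileSystem.create() implicitly creates the intermediate dir /acltest/t/a.
    FSDataOutputStream out = fs.create(new Path("/acltest/t/a/b"));
    out.close();

    // The implicitly created directory ends up with the parent's mode bits
    // (other::---) instead of permissions derived from the default ACL.
    AclStatus aclOfA = fs.getAclStatus(new Path("/acltest/t/a"));
    System.out.println(aclOfA);

    // For comparison, create a sibling directory explicitly and diff the ACLs.
    fs.mkdirs(new Path("/acltest/t/a2"));
    System.out.println(fs.getAclStatus(new Path("/acltest/t/a2")));
  }
}
{code}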



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8649) Default ACL is not inherited if directory is generated by FileSystem.create interface

2015-06-23 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598752#comment-14598752
 ] 

zhouyingchao commented on HDFS-8649:


[~cnauroth] Any comments ?

 Default ACL is not inherited if directory is generated by FileSystem.create 
 interface
 -

 Key: HDFS-8649
 URL: https://issues.apache.org/jira/browse/HDFS-8649
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao

 I have a directory /acltest/t, whose acl is as following:
 {code}
 # file: /acltest/t
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 group::rwx
 mask::rwx
 other::---
 default:user::rwx
 default:group::rwx
 default:mask::rwx
 default:other::rwx
 {code}
 My program create a file /acltest/t/a/b using the FileSystem.create 
 interface. The acl of directory /acltest/t/a is as following:
 {code}
 # file: /acltest/t/a
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 group::rwx
 mask::rwx
 other::---
 default:user::rwx
 default:group::rwx
 default:mask::rwx
 default:other::rwx
 {code}
 As you can see, the child directory b did not inherit its parent's default 
 acl for other.
 By looking into the implementation, the FileSystem.create interface will 
 automatically create non-existing entries in the path, it is done by calling 
 FSNamesystem.mkdirsRecursively and hard-coded the third param 
 (inheritPermission) as true. In FSNamesystem.mkdirsRecursively, when 
 inheritPermission is true, the parent's real permission (rather than 
 calculation from default acl) would be used as the new directory's permission.
 Is this behavior correct?  The default acl is not worked as people expected. 
 It kind of render many access issues in our setup.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8496) Calling stopWriter() with FSDatasetImpl lock held may block other threads

2015-06-23 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14598757#comment-14598757
 ] 

zhouyingchao commented on HDFS-8496:


[~cmccabe], Any comments?

 Calling stopWriter() with FSDatasetImpl lock held may  block other threads
 --

 Key: HDFS-8496
 URL: https://issues.apache.org/jira/browse/HDFS-8496
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8496-001.patch


 On a DN of a HDFS 2.6 cluster, we noticed some DataXceiver threads and  
 heartbeat threads are blocked for quite a while on the FSDatasetImpl lock. By 
 looking at the stack, we found the calling of stopWriter() with FSDatasetImpl 
 lock blocked everything.
 Following is the heartbeat stack, as an example, to show how threads are 
 blocked by FSDatasetImpl lock:
 {code}
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getAvailable(FsVolumeImpl.java:191)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x000770465dc0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The thread which held the FSDatasetImpl lock is just sleeping to wait another 
 thread to exit in stopWriter(). The stack is:
 {code}
java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 - locked 0x0007636953b8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverCheck(FsDatasetImpl.java:982)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverClose(FsDatasetImpl.java:1026)
 - locked 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:624)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 In this case, we deployed quite a lot other workloads on the DN, the local 
 file system and disk is quite busy. We guess this is why the stopWriter took 
 quite a long time.
 Any way, it is not quite reasonable to call stopWriter with the FSDatasetImpl 
 lock held.   In HDFS-7999, the createTemporary() is changed to call 
 stopWriter without FSDatasetImpl lock. We guess we should do so in the other 
 three methods: recoverClose()/recoverAppend/recoverRbw().
 I'll try to finish a patch for this today. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8496) Calling stopWriter() with FSDatasetImpl lock held may block other threads

2015-05-29 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8496:
---
Status: Patch Available  (was: Open)

Ran all HDFS unit tests without introducing any new failures.

 Calling stopWriter() with FSDatasetImpl lock held may  block other threads
 --

 Key: HDFS-8496
 URL: https://issues.apache.org/jira/browse/HDFS-8496
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 On a DN of a HDFS 2.6 cluster, we noticed some DataXceiver threads and  
 heartbeat threads are blocked for quite a while on the FSDatasetImpl lock. By 
 looking at the stack, we found the calling of stopWriter() with FSDatasetImpl 
 lock blocked everything.
 Following is the heartbeat stack, as an example, to show how threads are 
 blocked by FSDatasetImpl lock:
 {code}
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getAvailable(FsVolumeImpl.java:191)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x000770465dc0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The thread which held the FSDatasetImpl lock is just sleeping to wait another 
 thread to exit in stopWriter(). The stack is:
 {code}
java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 - locked 0x0007636953b8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverCheck(FsDatasetImpl.java:982)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverClose(FsDatasetImpl.java:1026)
 - locked 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:624)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 In this case, we deployed quite a lot other workloads on the DN, the local 
 file system and disk is quite busy. We guess this is why the stopWriter took 
 quite a long time.
 Any way, it is not quite reasonable to call stopWriter with the FSDatasetImpl 
 lock held.   In HDFS-7999, the createTemporary() is changed to call 
 stopWriter without FSDatasetImpl lock. We guess we should do so in the other 
 three methods: recoverClose()/recoverAppend/recoverRbw().
 I'll try to finish a patch for this today. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8496) Calling stopWriter() with FSDatasetImpl lock held may block other threads

2015-05-29 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8496:
---
Attachment: HDFS-8496-001.patch

 Calling stopWriter() with FSDatasetImpl lock held may  block other threads
 --

 Key: HDFS-8496
 URL: https://issues.apache.org/jira/browse/HDFS-8496
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8496-001.patch


 On a DN of a HDFS 2.6 cluster, we noticed some DataXceiver threads and  
 heartbeat threads are blocked for quite a while on the FSDatasetImpl lock. By 
 looking at the stack, we found the calling of stopWriter() with FSDatasetImpl 
 lock blocked everything.
 Following is the heartbeat stack, as an example, to show how threads are 
 blocked by FSDatasetImpl lock:
 {code}
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getAvailable(FsVolumeImpl.java:191)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x000770465dc0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The thread which held the FSDatasetImpl lock is just sleeping to wait another 
 thread to exit in stopWriter(). The stack is:
 {code}
java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 - locked 0x0007636953b8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverCheck(FsDatasetImpl.java:982)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverClose(FsDatasetImpl.java:1026)
 - locked 0x0007701badc0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:624)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 In this case, we deployed quite a lot other workloads on the DN, the local 
 file system and disk is quite busy. We guess this is why the stopWriter took 
 quite a long time.
 Any way, it is not quite reasonable to call stopWriter with the FSDatasetImpl 
 lock held.   In HDFS-7999, the createTemporary() is changed to call 
 stopWriter without FSDatasetImpl lock. We guess we should do so in the other 
 three methods: recoverClose()/recoverAppend/recoverRbw().
 I'll try to finish a patch for this today. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8496) Calling stopWriter() with FSDatasetImpl lock held may block other threads

2015-05-28 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-8496:
--

 Summary: Calling stopWriter() with FSDatasetImpl lock held may  
block other threads
 Key: HDFS-8496
 URL: https://issues.apache.org/jira/browse/HDFS-8496
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao


On a DN of an HDFS 2.6 cluster, we noticed some DataXceiver threads and heartbeat 
threads were blocked for quite a while on the FSDatasetImpl lock. Looking at the 
stacks, we found that calling stopWriter() with the FSDatasetImpl lock held 
blocked everything.

Following is the heartbeat stack, as an example, to show how threads are 
blocked by the FSDatasetImpl lock:
{code}
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
- waiting to lock 0x0007701badc0 (a 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getAvailable(FsVolumeImpl.java:191)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
- locked 0x000770465dc0 (a java.lang.Object)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
at java.lang.Thread.run(Thread.java:662)
{code}

The thread holding the FSDatasetImpl lock is just sleeping in stopWriter(), 
waiting for another thread to exit. Its stack is:
{code}
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1194)
- locked 0x0007636953b8 (a org.apache.hadoop.util.Daemon)
at 
org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverCheck(FsDatasetImpl.java:982)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.recoverClose(FsDatasetImpl.java:1026)
- locked 0x0007701badc0 (a 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:624)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
at java.lang.Thread.run(Thread.java:662)
{code}

In this case, we deployed quite a lot of other workloads on the DN, so the local 
file system and disks were quite busy. We guess this is why stopWriter() took such 
a long time.
In any case, it is not reasonable to call stopWriter() with the FSDatasetImpl lock 
held. In HDFS-7999, createTemporary() was changed to call stopWriter() without the 
FSDatasetImpl lock; we should do the same in the other three methods: 
recoverClose(), recoverAppend() and recoverRbw().

I'll try to finish a patch for this today. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8429) The DomainSocketWatcher thread should not block other threads if it dies

2015-05-26 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560303#comment-14560303
 ] 

zhouyingchao commented on HDFS-8429:


Colin, thank you for pointing out this issue.  I've changed and uploaded the 
patch accordingly.

 The DomainSocketWatcher thread should not block other threads if it dies
 

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch, HDFS-8429-002.patch, 
 HDFS-8429-003.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8429) The DomainSocketWatcher thread should not block other threads if it dies

2015-05-26 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8429:
---
Attachment: HDFS-8429-003.patch

Tested cases include TestParallelShortCircuitLegacyRead, 
TestParallelShortCircuitRead, TestParallelShortCircuitReadNoChecksum, 
TestParallelShortCircuitReadUnCached, TestShortCircuitCache, 
TestShortCircuitLocalRead, TestShortCircuitShm, TemporarySocketDirectory, 
TestDomainSocket, TestDomainSocketWatcher

 The DomainSocketWatcher thread should not block other threads if it dies
 

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch, HDFS-8429-002.patch, 
 HDFS-8429-003.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-21 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8429:
---
Attachment: HDFS-8429-002.patch

Tested cases include TestParallelShortCircuitLegacyRead, 
TestParallelShortCircuitRead, TestParallelShortCircuitReadNoChecksum, 
TestParallelShortCircuitReadUnCached, TestShortCircuitCache, 
TestShortCircuitLocalRead, TestShortCircuitShm, TemporarySocketDirectory, 
TestDomainSocket, TestDomainSocketWatcher

 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch, HDFS-8429-002.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-21 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14553702#comment-14553702
 ] 

zhouyingchao commented on HDFS-8429:


Yes, the modification of DomainSocketWatcher#add and DomainSocketWatcher#remove 
is not needed. I changed the patch accordingly and added a unit test case as 
suggested; the test case is largely borrowed from testStress() in the same file. 
Thank you.

 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-20 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14552095#comment-14552095
 ] 

zhouyingchao commented on HDFS-8429:


Colin, thank you for the great comments. In this case, I think the bottom line is 
that the death of the watcher thread should not block other threads, and the 
client side should be signalled to fall back to other read paths as quickly as 
possible.
I created a patch trying to resolve the blocking; besides that, I also changed the 
native getAndClearReadableFds method to throw an exception, as Colin suggested. 
Please feel free to post your thoughts and comments. Thank you.
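
A rough sketch of that shape, assuming fields (lock, closed, processedCond, LOG) 
in the style of DomainSocketWatcher; this is illustrative only, not the actual 
patch:

{code}
// Hedged sketch: if the watcher loop dies for any reason, mark the watcher
// closed and wake every thread parked in add(), so callers fail fast and fall
// back to remote reads instead of hanging forever. lock is a ReentrantLock,
// processedCond a Condition on it, closed a boolean field, LOG a commons log.
final Thread watcherThread = new Thread(new Runnable() {
  @Override
  public void run() {
    try {
      while (true) {
        // ... poll fdSet and dispatch callbacks; a native failure such as the
        // NPE from getAndClearReadableFds() lands in the catch below ...
      }
    } catch (Throwable t) {
      LOG.error(this + " terminating on unexpected exception", t);
    } finally {
      lock.lock();
      try {
        closed = true;               // add()/remove() must check this and throw
        processedCond.signalAll();   // unpark DataXceiver threads waiting in add()
      } finally {
        lock.unlock();
      }
    }
  }
});
{code}

On the native side, getAndClearReadableFds() can likewise surface an allocation 
failure as an exception rather than returning null, so callers see a descriptive 
error instead of a bare NullPointerException.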

 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-20 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8429:
---
Status: Patch Available  (was: Open)

 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-20 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8429:
---
Attachment: HDFS-8429-001.patch

Tested the following cases: TestDomainSocket, TestDomainSocketWatcher, 
TestParallelShortCircuitRead, TestFsDatasetCacheRevocation, 
TestFatasetCacheRevocation, TestScrLazyPersistFiles, 
TestParallelShortCircuitReadNoChecksum, TestDFSInputStream, TestBlockReaderFactory, 
TestParallelUnixDomainRead, TestParallelShortCircuitReadUnCached, 
TestBlockReaderLocalLegacy, TestPeerCache, TestShortCircuitCache, 
TestShortCircuitLocalRead, TestBlockReaderLocal, TestParallelShortCircuitLegacyRead, 
TestTracingShortCircuitLocalRead, TestEnhancedByteBufferAccess

 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8429-001.patch


 In our cluster, an application is hung when doing a short circuit read of 
 local hdfs block. By looking into the log, we found the DataNode's 
 DomainSocketWatcher.watcherThread has exited with following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 The line 463 is following code snippet:
 {code}
  try {
 for (int fd : fdSet.getAndClearReadableFds()) {
   sendCallbackAndRemove(getAndClearReadableFds, entries, fdSet,
 fd);
 }
 {code}
 getAndClearReadableFds is a native method which will malloc an int array. 
 Since our memory is very tight, it looks like the malloc failed and a NULL 
 pointer is returned.
 The bad thing is that other threads then blocked in stack like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that the users can know that something go  
 wrong  and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-19 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-8429:
--

 Summary: Death of watcherThread making other local read blocked
 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao


In our cluster, an application hung while doing a short-circuit read of a local 
HDFS block. By looking into the log, we found that the DataNode's 
DomainSocketWatcher.watcherThread had exited with the following log:
{code}
ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: Thread[Thread-25,5,main] 
terminating on unexpected exception
java.lang.NullPointerException
at 
org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
at java.lang.Thread.run(Thread.java:662)
{code}

Line 463 is in the following code snippet:
{code}
      try {
        for (int fd : fdSet.getAndClearReadableFds()) {
          sendCallbackAndRemove("getAndClearReadableFds", entries, fdSet,
              fd);
        }
{code}

getAndClearReadableFds is a native method which mallocs an int array. Since memory 
on the node was very tight, it looks like the malloc failed and a NULL pointer was 
returned.

The bad thing is that other threads then blocked with stacks like this:
{code}
DataXceiver for client 
unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
condition [0x7f09b9856000]
   java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  0x0007b0174808 (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at 
org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
at 
org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
at java.lang.Thread.run(Thread.java:662)
{code}

IMO, we should exit the DN so that users can know that something goes wrong 
and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8429) Death of watcherThread making other local read blocked

2015-05-19 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14550293#comment-14550293
 ] 

zhouyingchao commented on HDFS-8429:


[~cmccabe]  Should we stop DN in this condition?


 Death of watcherThread making other local read blocked
 --

 Key: HDFS-8429
 URL: https://issues.apache.org/jira/browse/HDFS-8429
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In our cluster, an application hung while doing a short-circuit read of a 
 local hdfs block. By looking into the log, we found that the DataNode's 
 DomainSocketWatcher.watcherThread had exited with the following log:
 {code}
 ERROR org.apache.hadoop.net.unix.DomainSocketWatcher: 
 Thread[Thread-25,5,main] terminating on unexpected exception
 java.lang.NullPointerException
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:463)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 Line 463 is in the following code snippet:
 {code}
  try {
    for (int fd : fdSet.getAndClearReadableFds()) {
      sendCallbackAndRemove("getAndClearReadableFds", entries, fdSet,
          fd);
    }
 {code}
 getAndClearReadableFds is a native method that mallocs an int array. 
 Since our memory was very tight, it looks like the malloc failed and a NULL 
 pointer was returned.
 The bad thing is that other threads are then blocked with stacks like this:
 {code}
 DataXceiver for client 
 unix:/home/work/app/hdfs/c3prc-micloud/datanode/dn_socket [Waiting for 
 operation #1] daemon prio=10 tid=0x7f0c9c086d90 nid=0x8fc3 waiting on 
 condition [0x7f09b9856000]
java.lang.Thread.State: WAITING (parking)
 at sun.misc.Unsafe.park(Native Method)
 - parking to wait for  0x0007b0174808 (a 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:156)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
 at 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(DomainSocketWatcher.java:323)
 at 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(ShortCircuitRegistry.java:322)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(DataXceiver.java:403)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(Receiver.java:214)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:95)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
 {code}
 IMO, we should exit the DN so that users can know that something goes wrong 
 and fix it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-8419:
--

 Summary: chmod impact user's effective ACL
 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao


I set a directory's ACL to assign rwx permission to user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand that chmod of an 
ACL-enabled file only changes the permission mask. What surprises me is that 
the operation changes h_user1's effective ACL from rwx to r-x.

Following are the ACLs before any operation:
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx#effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.
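
For what it's worth, here is a hedged sketch (assuming fs.defaultFS points at HDFS; the path and user name are just the placeholders from this report) of reproducing the same sequence through the FileSystem ACL API. The named-user entry's effective permission is capped by the mask, and chmod rewrites the mask from the group bits, which is the behavior observed above.
{code}
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;
import org.apache.hadoop.fs.permission.FsPermission;

public class AclChmodRepro {

  private static AclEntry access(AclEntryType type, String name, FsAction perm) {
    AclEntry.Builder b = new AclEntry.Builder()
        .setScope(AclEntryScope.ACCESS).setType(type).setPermission(perm);
    if (name != null) {
      b.setName(name);
    }
    return b.build();
  }

  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());  // assumes an HDFS default filesystem
    Path dir = new Path("/grptest");                       // placeholder path from this report
    fs.mkdirs(dir);

    // Roughly: setfacl --set user::rwx,user:h_user1:rwx,group::r-x,other::--- /grptest
    fs.setAcl(dir, Arrays.asList(
        access(AclEntryType.USER, null, FsAction.ALL),
        access(AclEntryType.USER, "h_user1", FsAction.ALL),
        access(AclEntryType.GROUP, null, FsAction.READ_EXECUTE),
        access(AclEntryType.OTHER, null, FsAction.NONE)));
    System.out.println("before chmod: " + fs.getAclStatus(dir));

    // chmod 750: the group bits become the new mask (r-x), so the
    // effective permission of user:h_user1 drops from rwx to r-x.
    fs.setPermission(dir, new FsPermission((short) 0750));
    System.out.println("after  chmod: " + fs.getAclStatus(dir));
  }
}
{code}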



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547909#comment-14547909
 ] 

zhouyingchao commented on HDFS-8419:


@Chris Nauroth

 chmod impact user's effective ACL
 -

 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I set a directory's ACL to assign rwx permission to user h_user1. Later, I 
 used chmod to change the group permission to r-x. I understand that chmod of 
 an ACL-enabled file only changes the permission mask. What surprises me is 
 that the operation changes h_user1's effective ACL from rwx to r-x.
 Following are the ACLs before any operation:
 -
 # file: /grptest
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 user:h_user1:rwx
 group::r-x
 mask::rwx
 other::---
 -
 Following are ACLs after chmod 750 /grptest
 -
 # file: /grptest
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 user:hdfs_admin:rwx   #effective:r-x
 group::r-x
 mask::r-x
 other::---
 # file: /grptest
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 user:h_user1:rwx  #effective:r-x
 group::r-x
 mask::r-x
 other::---
 -
 I'm wondering if this behavior is by design.  If not, I'd like to fix the 
 issue. Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8419:
---
Description: 
I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand chmod of a acl 
enabled file would only change the permission mask. What's make me surprise is 
that the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.

  was:
I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand chmod of a acl 
enabled file would only change the permission mask. What's make me surprise is 
that the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx#effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.


 chmod impact user's effective ACL
 -

 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
 used chmod to change the group permission to r-x. I understand chmod of a acl 
 enabled file would only change the permission mask. What's make me surprise 
 is that the operation will change the h_user1's effective ACL from rwx to r-x.
 Following are ACLs before any operaton:
 -
 # file: /grptest
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 user:h_user1:rwx
 group::r-x
 mask::rwx
 other::---
 -
 Following are ACLs after chmod 750 /grptest
 -
 # file: /grptest
 # owner: hdfs_tst_admin
 # group: supergroup
 user::rwx
 user:hdfs_admin:rwx   #effective:r-x
 group::r-x
 mask::r-x
 other::---
 -
 I'm wondering if this behavior is by design.  If not, I'd like to fix the 
 issue. Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8419:
---
Description: 
I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand chmod of a acl 
enabled file would only change the permission mask. What's make me surprise is 
that the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.

  was:
I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand chmod of a acl 
enabled file would only change the permission mask. What's make me surprise is 
that the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
# file: /grptest
# owner: hdfs_tst_admin
# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.


 chmod impact user's effective ACL
 -

 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
 used chmod to change the group permission to r-x. I understand chmod of a acl 
 enabled file would only change the permission mask. What's make me surprise 
 is that the operation will change the h_user1's effective ACL from rwx to r-x.
 Following are ACLs before any operaton:
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:h_user1:rwx
 group::r-x
 mask::rwx
 other::---
 -
 Following are ACLs after chmod 750 /grptest
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:hdfs_admin:rwx   #effective:r-x
 group::r-x
 mask::r-x
 other::---
 -
 I'm wondering if this behavior is by design.  If not, I'd like to fix the 
 issue. Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8419:
---
Description: 
I set a directory's ACL to assign rwx permission to user h_user1. Later, I used 
chmod to change the group permission to r-x. I understand chmod of an acl 
enabled file would only change the permission mask. The abnormal thing is that 
the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.

  was:
I set a directory's ACL to assign rwx permission to a user h_user1. Later, I 
used chmod to change the group permission to r-x. I understand chmod of a acl 
enabled file would only change the permission mask. What's make me surprise is 
that the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.


 chmod impact user's effective ACL
 -

 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I set a directory's ACL to assign rwx permission to user h_user1. Later, I 
 used chmod to change the group permission to r-x. I understand chmod of an 
 acl enabled file would only change the permission mask. The abnormal thing is 
 that the operation will change the h_user1's effective ACL from rwx to r-x.
 Following are ACLs before any operaton:
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:h_user1:rwx
 group::r-x
 mask::rwx
 other::---
 -
 Following are ACLs after chmod 750 /grptest
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:hdfs_admin:rwx   #effective:r-x
 group::r-x
 mask::r-x
 other::---
 -
 I'm wondering if this behavior is by design.  If not, I'd like to fix the 
 issue. Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8419) chmod impact user's effective ACL

2015-05-18 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8419:
---
Description: 
I set a directory's ACL to assign rwx permission to user h_user1. Later, I used 
chmod to change the group permission to r-x. I understand chmod of an acl 
enabled file would only change the permission mask. The abnormal thing is that 
the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx#effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.

  was:
I set a directory's ACL to assign rwx permission to user h_user1. Later, I used 
chmod to change the group permission to r-x. I understand chmod of an acl 
enabled file would only change the permission mask. The abnormal thing is that 
the operation will change the h_user1's effective ACL from rwx to r-x.

Following are ACLs before any operaton:
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:h_user1:rwx
group::r-x
mask::rwx
other::---
-

Following are ACLs after chmod 750 /grptest
-
\# file: /grptest
\# owner: hdfs_tst_admin
\# group: supergroup
user::rwx
user:hdfs_admin:rwx #effective:r-x
group::r-x
mask::r-x
other::---
-

I'm wondering if this behavior is by design.  If not, I'd like to fix the 
issue. Thank you.


 chmod impact user's effective ACL
 -

 Key: HDFS-8419
 URL: https://issues.apache.org/jira/browse/HDFS-8419
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: HDFS
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I set a directory's ACL to assign rwx permission to user h_user1. Later, I 
 used chmod to change the group permission to r-x. I understand chmod of an 
 acl enabled file would only change the permission mask. The abnormal thing is 
 that the operation will change the h_user1's effective ACL from rwx to r-x.
 Following are ACLs before any operaton:
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:h_user1:rwx
 group::r-x
 mask::rwx
 other::---
 -
 Following are ACLs after chmod 750 /grptest
 -
 \# file: /grptest
 \# owner: hdfs_tst_admin
 \# group: supergroup
 user::rwx
 user:h_user1:rwx  #effective:r-x
 group::r-x
 mask::r-x
 other::---
 -
 I'm wondering if this behavior is by design.  If not, I'd like to fix the 
 issue. Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-04-30 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521202#comment-14521202
 ] 

zhouyingchao commented on HDFS-7897:


Any updates regarding this simple patch?

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7897-001.patch


 In JournalNode.stop(), the metrics system is not shut down. The issue was 
 found while reading the code.
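
A minimal sketch of the kind of change being proposed (the class and fields below are illustrative, not the actual JournalNode code): shut the metrics system down along with the other resources in stop().
{code}
import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;

// Illustrative only: a stop() that also shuts down the metrics system.
public class JournalNodeLikeService {
  private volatile boolean running = true;

  public void stop() {
    if (!running) {
      return;
    }
    running = false;
    // ... stop the RPC server, HTTP server and journals here ...
    DefaultMetricsSystem.shutdown();  // presumably the call that is missing today
  }
}
{code}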



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-04-04 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7999:
---
Attachment: HDFS-7999-003.patch

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch, HDFS-7999-002.patch, 
 HDFS-7999-003.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I took jstack dumps twice, 
 and it looks like the heartbeat threads are all blocked (at getStorageReport) 
 on the dataset lock, which is held by a thread calling createTemporary, which 
 in turn is blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)
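
As a self-contained illustration of why this stalls heartbeats (plain Java, not Hadoop code): joining another thread while holding a monitor that the heartbeat path also needs keeps every other acquirer blocked for as long as the join takes.
{code}
// Minimal demo of the blocking pattern: join() while holding a shared lock.
public class JoinUnderLockDemo {
  private static final Object datasetLock = new Object();

  public static void main(String[] args) throws Exception {
    Thread oldWriter = new Thread(() -> {
      try { Thread.sleep(100_000); } catch (InterruptedException ignored) { }
    }, "old-writer");
    oldWriter.start();

    Thread xceiver = new Thread(() -> {
      synchronized (datasetLock) {       // like createTemporary holding FsDatasetImpl
        try { oldWriter.join(); }        // like ReplicaInPipeline.stopWriter()
        catch (InterruptedException ignored) { }
      }
    }, "data-xceiver");
    xceiver.start();

    Thread.sleep(200);                   // let the xceiver grab the lock first
    Thread heartbeat = new Thread(() -> {
      synchronized (datasetLock) {       // like getStorageReports -> getDfsUsed
        System.out.println("heartbeat finally got the lock");
      }
    }, "heartbeat");
    heartbeat.start();                   // stays BLOCKED until old-writer exits
  }
}
{code}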



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-04-04 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14395791#comment-14395791
 ] 

zhouyingchao commented on HDFS-7999:


Thank you, Colin. I'll update the patch soon.

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch, HDFS-7999-002.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I took jstack dumps twice, 
 and it looks like the heartbeat threads are all blocked (at getStorageReport) 
 on the dataset lock, which is held by a thread calling createTemporary, which 
 in turn is blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-04-02 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392426#comment-14392426
 ] 

zhouyingchao commented on HDFS-7999:


Thanks a lot for the comments, Colin.  I will update the patch accordingly.

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I took jstack dumps twice, 
 and it looks like the heartbeat threads are all blocked (at getStorageReport) 
 on the dataset lock, which is held by a thread calling createTemporary, which 
 in turn is blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7999:
---
Attachment: HDFS-7999-002.patch

Test with 
-Dtest=FsDatasetTestUtil,LazyPersistTestCase,TestDatanodeRestart,TestFsDatasetImpl,TestFsVolumeList,TestInterDatanodeProtocol,TestLazyPersistFiles,TestRbwSpaceReservation,TestReplicaMap,TestScrLazyPersistFiles,TestWriteToReplica,TestBalancer,TestDatanodeManager

 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch, HDFS-7999-002.patch


 I'm using 2.6.0 and noticed that sometimes the DN's heartbeats were delayed 
 for a very long time, say more than 100 seconds. I took jstack dumps twice, 
 and it looks like the heartbeat threads are all blocked (at getStorageReport) 
 on the dataset lock, which is held by a thread calling createTemporary, which 
 in turn is blocked waiting for an earlier incarnation of the writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Attachment: HDFS-8045-001.patch

Test with 
-Dtest=TestSimulatedFSDataset,TestNamenodeCapacityReport,TestFileCreation,TestDecommission

 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserving some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually reports the NonDfsUsed of Datanodes as 0 even if we 
 write some non-hdfs data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - the 
 following is the explanation.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, RealNonDfsUsed to represent real value of NonDfsUsed(which will 
 include non-hdfs files and meta data consumed by local filesystem).
 In the current implementation, for a volume, the available space is actually 
 calculated as
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - RealNonDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - RealNonDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = RealNonDfsUsed - Reserved;
 {code}
 Either way it is far from the correct value.
 After investigating the implementation, we believe Reserved and 
 RbwReserved should be subtracted from the available space in getAvailable 
 since they are actually not available to hdfs in any sense.  I'll post a patch soon.
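
To make the distortion concrete, here is a small self-contained calculation (the numbers are made up for illustration) comparing the NonDfsUsed the NameNode derives from the current getAvailable formula against the real value:
{code}
// Illustrative numbers only: shows how the derived NonDfsUsed goes wrong.
public class NonDfsUsedCalc {
  public static void main(String[] args) {
    long raw = 1000, dfsUsed = 300, reserved = 100, rbwReserved = 20, realNonDfsUsed = 50;

    // Current implementation of getAvailable():
    long available = Math.min(raw - reserved - dfsUsed - rbwReserved,
                              raw - dfsUsed - realNonDfsUsed);

    // The NameNode then derives: NonDfsUsed = Raw - Reserved - DfsUsed - Available
    long derivedNonDfsUsed = raw - reserved - dfsUsed - available;

    System.out.println("available          = " + available);          // 580
    System.out.println("derived NonDfsUsed = " + derivedNonDfsUsed);  // 20 (== RbwReserved)
    System.out.println("real NonDfsUsed    = " + realNonDfsUsed);     // 50
  }
}
{code}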



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Description: 
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we write 
some non-hdfs data to the volume. After some investigation, we think there is 
an issue in the calculation of FsVolumeImpl.getAvailable - following is the 
explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  
{code}
min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
{code}
Later on, Namenode will calculate NonDfsUsed of the volume as 
{code}
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
DfsUsed - NDfsUsed}
{code}

Given the calculation, finally we will have -
{code}
if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
else NonDfsUsed = NDfsUsed - Reserved;
{code}
Either way it is far from a correct value.

After investigation the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any way.  I'll post a patch soon.

  was:
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
actually write some data to the volume. After some investigation, we think 
there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
is the explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  
{code}
min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
{code}
Later on, Namenode will calculate NonDfsUsed of the volume as 
{code}
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
DfsUsed - NDfsUsed}
{code}

Given the calculation, finally we will have -
{code}
if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
else NonDfsUsed = NDfsUsed - Reserved;
{code}
Either way it is far from a correct value.

After investigation the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any way.  I'll post a patch soon.


 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserve some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
 write some non-hdfs data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
 is the explaination.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
 non-hdfs files and meta data consumed by local filesystem).
 In current implementation, for a volume, available space will be actually 
 calculated as  
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - NDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = NDfsUsed - Reserved;
 {code}
 Either way it is far from a correct value.
 After investigation the implementation, we believe the Reserved and 
 RbwReserved should be subtract from available in getAvailable 

[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Description: 
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
actually write some data to the volume. After some investigation, we think 
there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
is the explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  
{code}
min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
{code}
Later on, Namenode will calculate NonDfsUsed of the volume as 
{code}
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
DfsUsed - NDfsUsed}
{code}

Given the calculation, finally we will have -
{code}
if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
else NonDfsUsed = NDfsUsed - Reserved;
{code}
Either way it is far from a correct value.

After investigation the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any way.  I'll post a patch soon.

  was:
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
actually write some data to the volume. After some investigation, we think 
there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
is the explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - 
NDfsUsed }. 
Later on, Namenode will calculate NonDfsUsed of the volume as Raw - Reserved - 
DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - DfsUsed - 
NDfsUsed}.

Given the calculation, finally we will have -
if Reserved + RbwReserved > NDfsUsed, then the calculated NonDfsUsed will be 
RbwReserved. Otherwise if Reserved + RbwReserved < NDfsUsed, then the 
calculated NonDfsUsed would be NDfsUsed - Reserved. Either way it is far from 
a correct value.

After investigation the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any way.  I'll post a patch soon.


 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserve some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
 actually write some data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
 is the explaination.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
 non-hdfs files and meta data consumed by local filesystem).
 In current implementation, for a volume, available space will be actually 
 calculated as  
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - NDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = NDfsUsed - Reserved;
 {code}
 Either way it is far from a correct value.
 After investigation the implementation, we believe the Reserved and 
 

[jira] [Created] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-8045:
--

 Summary: Incorrect calculation of NonDfsUsed and Remaining
 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch

After reserving some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually reports the NonDfsUsed of Datanodes as 0 even if we 
actually write some data to the volume. After some investigation, we think 
there is an issue in the calculation of FsVolumeImpl.getAvailable - the 
following is the explanation.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In the current implementation, for a volume, the available space is actually 
calculated as min{Raw - Reserved - DfsUsed - RbwReserved, Raw - DfsUsed - 
NDfsUsed}. 
Later on, the Namenode will calculate the NonDfsUsed of the volume as 
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, 
Raw - DfsUsed - NDfsUsed}.

Given this calculation, we finally have:
if Reserved + RbwReserved > NDfsUsed, then the calculated NonDfsUsed will be 
RbwReserved. Otherwise, if Reserved + RbwReserved < NDfsUsed, then the 
calculated NonDfsUsed will be NDfsUsed - Reserved. Either way it is far from 
the correct value.

After investigating the implementation, we believe Reserved and RbwReserved 
should be subtracted from the available space in getAvailable since they are 
actually not available to hdfs in any way.  I'll post a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Component/s: datanode

 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserve some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
 write some non-hdfs data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
 is the explaination.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, RealNonDfsUsed to represent real value of NonDfsUsed(which will 
 include non-hdfs files and meta data consumed by local filesystem).
 In current implementation, for a volume, available space will be actually 
 calculated as  
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - RealNonDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - RealNonDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = RealNonDfsUsed - Reserved;
 {code}
 Either way it is far from the correct value.
 After investigating the implementation, we believe the Reserved and 
 RbwReserved should be subtract from available in getAvailable since they are 
 actually not available to hdfs in any sense.  I'll post a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Affects Version/s: 2.6.0
   Status: Patch Available  (was: Open)

 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserve some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
 write some non-hdfs data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
 is the explaination.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, RealNonDfsUsed to represent real value of NonDfsUsed(which will 
 include non-hdfs files and meta data consumed by local filesystem).
 In current implementation, for a volume, available space will be actually 
 calculated as  
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - RealNonDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - RealNonDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = RealNonDfsUsed - Reserved;
 {code}
 Either way it is far from the correct value.
 After investigating the implementation, we believe the Reserved and 
 RbwReserved should be subtract from available in getAvailable since they are 
 actually not available to hdfs in any sense.  I'll post a patch soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-8045) Incorrect calculation of NonDfsUsed and Remaining

2015-04-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-8045:
---
Description: 
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we write 
some non-hdfs data to the volume. After some investigation, we think there is 
an issue in the calculation of FsVolumeImpl.getAvailable - following is the 
explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, RealNonDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  
{code}
min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - RealNonDfsUsed }
{code}
Later on, Namenode will calculate NonDfsUsed of the volume as 
{code}
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
DfsUsed - RealNonDfsUsed}
{code}

Given the calculation, finally we will have -
{code}
if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
else NonDfsUsed = RealNonDfsUsed - Reserved;
{code}
Either way it is far from the correct value.

After investigating the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any sense.  I'll post a patch soon.

  was:
After reserve some space via the param dfs.datanode.du.reserved, we noticed 
that the namenode usually report NonDfsUsed of Datanodes as 0 even if we write 
some non-hdfs data to the volume. After some investigation, we think there is 
an issue in the calculation of FsVolumeImpl.getAvailable - following is the 
explaination.

For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
space consumed by hdfs blocks, Reserved to represent reservation through 
dfs.datanode.du.reserved, RbwReserved to represent space reservation for rbw 
blocks, NDfsUsed to represent real value of NonDfsUsed(which will include 
non-hdfs files and meta data consumed by local filesystem).
In current implementation, for a volume, available space will be actually 
calculated as  
{code}
min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - NDfsUsed }
{code}
Later on, Namenode will calculate NonDfsUsed of the volume as 
{code}
Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
DfsUsed - NDfsUsed}
{code}

Given the calculation, finally we will have -
{code}
if (Reserved + RbwReserved > NDfsUsed) NonDfsUsed = RbwReserved;
else NonDfsUsed = NDfsUsed - Reserved;
{code}
Either way it is far from a correct value.

After investigation the implementation, we believe the Reserved and RbwReserved 
should be subtract from available in getAvailable since they are actually not 
available to hdfs in any way.  I'll post a patch soon.


 Incorrect calculation of NonDfsUsed and Remaining
 -

 Key: HDFS-8045
 URL: https://issues.apache.org/jira/browse/HDFS-8045
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-8045-001.patch


 After reserve some space via the param dfs.datanode.du.reserved, we noticed 
 that the namenode usually report NonDfsUsed of Datanodes as 0 even if we 
 write some non-hdfs data to the volume. After some investigation, we think 
 there is an issue in the calculation of FsVolumeImpl.getAvailable - following 
 is the explaination.
 For a volume, let's use Raw to represent raw capacity, DfsUsed to represent 
 space consumed by hdfs blocks, Reserved to represent reservation through 
 dfs.datanode.du.reserved, RbwReserved to represent space reservation for 
 rbw blocks, RealNonDfsUsed to represent real value of NonDfsUsed(which will 
 include non-hdfs files and meta data consumed by local filesystem).
 In current implementation, for a volume, available space will be actually 
 calculated as  
 {code}
 min{Raw - Reserved - DfsUsed -RbwReserved,  Raw - DfsUsed - RealNonDfsUsed }
 {code}
 Later on, Namenode will calculate NonDfsUsed of the volume as 
 {code}
 Raw - Reserved - DfsUsed - min{Raw - Reserved - DfsUsed - RbwReserved, Raw - 
 DfsUsed - RealNonDfsUsed}
 {code}
 Given the calculation, finally we will have -
 {code}
 if (Reserved + RbwReserved > RealNonDfsUsed) NonDfsUsed = RbwReserved;
 else NonDfsUsed = RealNonDfsUsed - Reserved;
 {code}
 Either way it is far from the correct value.
 After investigating the implementation, we believe the Reserved and 

[jira] [Commented] (HDFS-5215) dfs.datanode.du.reserved is not taking effect as it's not considered while getting the available space

2015-04-02 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14392609#comment-14392609
 ] 

zhouyingchao commented on HDFS-5215:


Shouldn't we also subtract rbwReserved ?

 dfs.datanode.du.reserved is not taking effect as it's not considered while 
 getting the available space
 --

 Key: HDFS-5215
 URL: https://issues.apache.org/jira/browse/HDFS-5215
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 3.0.0
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
 Attachments: HDFS-5215-002.patch, HDFS-5215-003.patch, HDFS-5215.patch


 {code}
 public long getAvailable() throws IOException {
   long remaining = getCapacity() - getDfsUsed();
   long available = usage.getAvailable();
   if (remaining > available) {
     remaining = available;
   }
   return (remaining > 0) ? remaining : 0;
 }
 {code}
 Here we are not considering the reserved space while getting the Available 
 Space.
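
A hedged sketch of the direction being discussed (field names are placeholders, not the actual FsVolumeImpl members): subtract the configured reservation, and, per the comment above, possibly the RBW reservation as well, before comparing against what the OS reports as free.
{code}
// Sketch only: account for reserved (and possibly RBW-reserved) space
// when computing what is available to HDFS on this volume.
class VolumeAvailableSketch {
  private long capacity;     // raw capacity of the volume
  private long dfsUsed;      // bytes used by HDFS block files
  private long osAvailable;  // what the OS reports as free (usage.getAvailable())
  private long reserved;     // dfs.datanode.du.reserved
  private long rbwReserved;  // bytes reserved for replicas being written (rbw)

  public long getAvailable() {
    long remaining = capacity - dfsUsed - reserved - rbwReserved;
    long available = osAvailable - reserved - rbwReserved;
    if (remaining > available) {
      remaining = available;
    }
    return (remaining > 0) ? remaining : 0;
  }
}
{code}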



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a very long time

2015-03-31 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14388080#comment-14388080
 ] 

zhouyingchao commented on HDFS-7999:


Thank you for looking into the patch.  Here is an explanation of the logic of 
createTemporary() after applying the patch:
1.  If there is no ReplicaInfo in volumeMap for the passed-in ExtendedBlock b, 
we create one, insert it into volumeMap and return from line 1443.
2.  If there is a ReplicaInfo in volumeMap and its GS is newer than that of the 
passed-in ExtendedBlock b, we throw ReplicaAlreadyExistsException from line 1447.
3.  If there is a ReplicaInfo in volumeMap but its GS is older than that of the 
passed-in ExtendedBlock b, this is a new write and the earlier writer should be 
stopped.  We release the FsDatasetImpl lock and try to stop the earlier writer 
without holding the lock.
4.  After the earlier writer is stopped, we need to evict the earlier writer's 
ReplicaInfo from volumeMap, and to that end we re-acquire the FsDatasetImpl 
lock.  However, since this thread released the FsDatasetImpl lock while stopping 
the earlier writer, another thread might have come in and changed the 
ReplicaInfo of this block in volumeMap.  This situation is unlikely, but we still 
have to handle it.  The loop in the patch handles exactly that -- after 
re-acquiring the FsDatasetImpl lock, it checks whether the current ReplicaInfo 
in volumeMap is still the one we saw before stopping the writer; if so we simply 
evict it, create and insert a new one, and return from line 1443.  Otherwise, 
another thread slipped in and changed the ReplicaInfo while we were stopping the 
earlier writer.  In that case we check whether that thread inserted a block with 
an even newer GS than ours; if so we throw ReplicaAlreadyExistsException from 
line 1447, otherwise we stop that thread's writer just like we stopped the 
earlier writer in step 3.  A self-contained sketch of this control flow follows 
below.
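For readers who prefer code, below is a small self-contained model of that retry loop. 
It is only a sketch, not the actual patch: the lock object stands in for the 
FsDatasetImpl lock, the map stands in for volumeMap, and all names are illustrative.
{code}
// Simplified model of the retry loop described above; NOT the actual HDFS-7999 patch.
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class CreateTemporarySketch {
  static class Replica {
    final long genStamp;
    final Thread writer;
    Replica(long genStamp, Thread writer) { this.genStamp = genStamp; this.writer = writer; }
  }

  private final Object lock = new Object();            // stands in for the FsDatasetImpl lock
  private final Map<Long, Replica> volumeMap = new HashMap<>();

  Replica createTemporary(long blockId, long genStamp, long maxWaitMs)
      throws IOException, InterruptedException {
    final long deadline = System.currentTimeMillis() + maxWaitMs;
    while (true) {
      Replica toStop;
      synchronized (lock) {
        Replica current = volumeMap.get(blockId);
        if (current == null) {                          // case 1: nothing there yet -> insert and return
          Replica created = new Replica(genStamp, Thread.currentThread());
          volumeMap.put(blockId, created);
          return created;
        }
        if (current.genStamp >= genStamp) {             // case 2: an equal/newer GS already exists
          throw new IOException("Replica already exists with GS " + current.genStamp);
        }
        toStop = current;                               // case 3: older GS -> its writer must be stopped
      }
      // Stop the earlier writer WITHOUT holding the lock, so heartbeats etc. are not blocked.
      toStop.writer.interrupt();
      toStop.writer.join(Math.max(1, deadline - System.currentTimeMillis()));
      synchronized (lock) {
        // case 4: only evict if the map still holds the replica we just stopped;
        // otherwise another thread slipped in and we loop to re-evaluate its GS.
        if (volumeMap.get(blockId) == toStop) {
          Replica created = new Replica(genStamp, Thread.currentThread());
          volumeMap.put(blockId, created);
          return created;
        }
      }
      if (System.currentTimeMillis() > deadline) {      // last-resort bail-out
        throw new IOException("Gave up waiting for earlier writer of block " + blockId);
      }
    }
  }
}
{code}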


 FsDatasetImpl#createTemporary sometimes holds the FSDatasetImpl lock for a 
 very long time
 -

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 

[jira] [Commented] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-30 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386528#comment-14386528
 ] 

zhouyingchao commented on HDFS-7999:


Hi Xinwei
Thank you for sharing the status regarding HDFS-7060. I think that is the right 
way to fix the heartbeat issue.  That said, I still think the patch here is 
necessary - the current implementation of createTemporary() might sleep for up 
to 60s with the lock held, which does not make sense, right?  It might block 
other threads besides the heartbeat for a long time anyway.
Comments?  Thoughts?

 DN Hearbeat is blocked by waiting FsDatasetImpl lock
 

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-30 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386536#comment-14386536
 ] 

zhouyingchao commented on HDFS-7999:


I tested TestBalancer with the patch on my rig and it passes.  From the log of 
the failure, it does not look related to the patch.

 DN Hearbeat is blocked by waiting FsDatasetImpl lock
 

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7997) The first non-existing xattr should also throw IOException

2015-03-29 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14386205#comment-14386205
 ] 

zhouyingchao commented on HDFS-7997:


Thank you for pointing it out.  Actually we are using names of the form 
user.xxx; the pseudo-code snippet here is just used to explain the issue.
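For what it's worth, once both lookups behave consistently the caller side could look 
roughly like the sketch below (user.nm and DEFAULT_VALUE are placeholders, and the 
catch is deliberately broad since the missing-xattr case surfaces as an IOException):
{code}
byte[] value;
try {
  value = fs.getXAttr(path, "user.nm");
} catch (IOException notFound) {
  // the xattr has never been set -> initialize it; same pattern for "user.nn"
  fs.setXAttr(path, "user.nm", DEFAULT_VALUE);
  value = DEFAULT_VALUE;
}
{code}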

 The first non-existing xattr should also throw IOException
 --

 Key: HDFS-7997
 URL: https://issues.apache.org/jira/browse/HDFS-7997
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
Priority: Minor
 Attachments: HDFS-7997-001.patch


 We use the following code snippet to get/set xattrs. However, if there are no 
 xattrs have ever been set, the first getXAttr returns null and the second one 
 just throws exception with message like At least one of the attributes 
 provided was not found..  This is not expected, we believe they should 
 behave in the same way - i.e either both getXAttr returns null or both 
 getXAttr throw exception with the message ... not found.  We will provide a 
 patch to make them both throw exception.
 
 attrValueNM = fs.getXAttr(path, nm);
 if (attrValueNM == null) {
  fs.setXAttr(nm, DEFAULT_VALUE);
 }
 attrValueNN = fs.getXAttr(path, nn);
 if (attrValueNN == null) {
 fs.setXAttr(nn, DEFAULT_VALUE);
 }
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7997) The first non-existing xattr should also throw IOException

2015-03-27 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7997:
---
Affects Version/s: 2.6.0
   Status: Patch Available  (was: Open)

Test with  
-Dtest=TestXAttrCLI,TestXAttrWithSnapshot,TestNameNodeXAttr,TestFileContextXAttr,FSXAttrBaseTest,TestFSImageWithXAttr,TestXAttrConfigFlag,TestXAttrsWithHA,TestWebHDFSXAttr,TestXAttr,TestViewFileSystemWithXAttrs,TestViewFsWithXAttrs
 

 The first non-existing xattr should also throw IOException
 --

 Key: HDFS-7997
 URL: https://issues.apache.org/jira/browse/HDFS-7997
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 We use the following code snippet to get/set xattrs. However, if there are no 
 xattrs have ever been set, the first getXAttr returns null and the second one 
 just throws exception with message like At least one of the attributes 
 provided was not found..  This is not expected, we believe they should 
 behave in the same way - i.e either both getXAttr returns null or both 
 getXAttr throw exception with the message ... not found.  We will provide a 
 patch to make them both throw exception.
 
 attrValueNM = fs.getXAttr(path, nm);
 if (attrValueNM == null) {
  fs.setXAttr(nm, DEFAULT_VALUE);
 }
 attrValueNN = fs.getXAttr(path, nn);
 if (attrValueNN == null) {
 fs.setXAttr(nn, DEFAULT_VALUE);
 }
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7212) Huge number of BLOCKED threads rendering DataNodes useless

2015-03-27 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383655#comment-14383655
 ] 

zhouyingchao commented on HDFS-7212:


I'll create a JIRA for the issue I met and submit a patch.

 Huge number of BLOCKED threads rendering DataNodes useless
 --

 Key: HDFS-7212
 URL: https://issues.apache.org/jira/browse/HDFS-7212
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
 Environment: PROD
Reporter: Istvan Szukacs

 There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM 
 and rendering the service unusable, missing heartbeats and stopping data 
 access. The threads look like this:
 {code}
 3415 (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may 
 be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
 line=186 (Compiled frame)
 - 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
 @bci=1, line=834 (Interpreted frame)
 - 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
  int) @bci=67, line=867 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, 
 line=1197 (Interpreted frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, 
 line=214 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled 
 frame)
 - 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket,
  org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 
 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String,
  org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted 
 frame)
 - 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String)
  @bci=212, line=413 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream)
  @bci=13, line=172 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op)
  @bci=149, line=92 (Compiled frame)
 - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 
 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame)
 {code}
 Has anybody seen this before?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-27 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-7999:
--

 Summary: DN Hearbeat is blocked by waiting FsDatasetImpl lock
 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao


I'm using 2.6.0 and noticed that sometimes a DN's heartbeat is delayed for a 
very long time, say more than 100 seconds. I took the jstack twice and it looks 
like the heartbeat threads are all blocked (at getStorageReport) on the dataset 
lock, which is held by a thread that is calling createTemporary, which in turn 
is blocked waiting for the earlier incarnation's writer to exit.

The heartbeat thread stack:
   java.lang.Thread.State: BLOCKED (on object monitor)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
- waiting to lock 0x0007b01428c0 (a 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
- locked 0x0007b0140ed0 (a java.lang.Object)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
at java.lang.Thread.run(Thread.java:662)

The DataXceiver thread holds the dataset lock:
DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
nid=0x52bc in Object.wait() [0x7f11d78f7000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1194)
locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
at 
org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
locked 0x0007b01428c0 (a 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
at 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
at 
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
at 
org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at 
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-27 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao reassigned HDFS-7999:
--

Assignee: zhouyingchao

 DN Hearbeat is blocked by waiting FsDatasetImpl lock
 

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7212) Huge number of BLOCKED threads rendering DataNodes useless

2015-03-27 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383654#comment-14383654
 ] 

zhouyingchao commented on HDFS-7212:


I'll create a JIRA for the issue I met and submit a patch.

 Huge number of BLOCKED threads rendering DataNodes useless
 --

 Key: HDFS-7212
 URL: https://issues.apache.org/jira/browse/HDFS-7212
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.4.0
 Environment: PROD
Reporter: Istvan Szukacs

 There are 3000 - 8000 threads in each datanode JVM, blocking the entire VM 
 and rendering the service unusable, missing heartbeats and stopping data 
 access. The threads look like this:
 {code}
 3415 (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may 
 be imprecise)
 - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, 
 line=186 (Compiled frame)
 - 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt() 
 @bci=1, line=834 (Interpreted frame)
 - 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.util.concurrent.locks.AbstractQueuedSynchronizer$Node,
  int) @bci=67, line=867 (Interpreted frame)
 - java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(int) @bci=17, 
 line=1197 (Interpreted frame)
 - java.util.concurrent.locks.ReentrantLock$NonfairSync.lock() @bci=21, 
 line=214 (Compiled frame)
 - java.util.concurrent.locks.ReentrantLock.lock() @bci=4, line=290 (Compiled 
 frame)
 - 
 org.apache.hadoop.net.unix.DomainSocketWatcher.add(org.apache.hadoop.net.unix.DomainSocket,
  org.apache.hadoop.net.unix.DomainSocketWatcher$Handler) @bci=4, line=286 
 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.createNewMemorySegment(java.lang.String,
  org.apache.hadoop.net.unix.DomainSocket) @bci=169, line=283 (Interpreted 
 frame)
 - 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitShm(java.lang.String)
  @bci=212, line=413 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitShm(java.io.DataInputStream)
  @bci=13, line=172 (Interpreted frame)
 - 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(org.apache.hadoop.hdfs.protocol.datatransfer.Op)
  @bci=149, line=92 (Compiled frame)
 - org.apache.hadoop.hdfs.server.datanode.DataXceiver.run() @bci=510, line=232 
 (Compiled frame)
 - java.lang.Thread.run() @bci=11, line=744 (Interpreted frame)
 {code}
 Has anybody seen this before?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-27 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7999:
---
Attachment: HDFS-7999-001.patch

Test with 
-Dtest=FsDatasetTestUtil,LazyPersistTestCase,TestDatanodeRestart,TestFsDatasetImpl,TestFsVolumeList,TestInterDatanodeProtocol,TestLazyPersistFiles,TestRbwSpaceReservation,TestReplicaMap,TestScrLazyPersistFiles,TestWriteToReplica

 DN Hearbeat is blocked by waiting FsDatasetImpl lock
 

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7999) DN Hearbeat is blocked by waiting FsDatasetImpl lock

2015-03-27 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7999:
---
Status: Patch Available  (was: Open)

The fix is to call stopWriter without holding the FsDatasetImpl lock. However, 
without the lock, another thread may slip in and inject another ReplicaInfo into 
the map while we are stopping the writer. To resolve this, we try to invalidate 
the stale replica in a loop.  As a last resort, if we stay in the loop for too 
long, we bail out with an IOException.

 DN Hearbeat is blocked by waiting FsDatasetImpl lock
 

 Key: HDFS-7999
 URL: https://issues.apache.org/jira/browse/HDFS-7999
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7999-001.patch


 I'm using 2.6.0 and noticed that sometime DN's heartbeat were delayed for 
 very long time, say more than 100 seconds. I get the jstack twice and looks 
 like they are all blocked (at getStorageReport) by dataset lock, and which is 
 held by a thread that is calling createTemporary, which again is blocked to 
 wait earlier incarnation writer to exit.
 The heartbeat thread stack:
java.lang.Thread.State: BLOCKED (on object monitor)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getDfsUsed(FsVolumeImpl.java:152)
 - waiting to lock 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getStorageReports(FsDatasetImpl.java:144)
 - locked 0x0007b0140ed0 (a java.lang.Object)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:575)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:680)
 at 
 org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:850)
 at java.lang.Thread.run(Thread.java:662)
 The DataXceiver thread holds the dataset lock:
 DataXceiver for client at X daemon prio=10 tid=0x7f14041e6480 
 nid=0x52bc in Object.wait() [0x7f11d78f7000]
 java.lang.Thread.State: TIMED_WAITING (on object monitor)
 at java.lang.Object.wait(Native Method)
 at java.lang.Thread.join(Thread.java:1194)
 locked 0x0007a33b85d8 (a org.apache.hadoop.util.Daemon)
 at 
 org.apache.hadoop.hdfs.server.datanode.ReplicaInPipeline.stopWriter(ReplicaInPipeline.java:183)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:1231)
 locked 0x0007b01428c0 (a 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
 at 
 org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createTemporary(FsDatasetImpl.java:114)
 at 
 org.apache.hadoop.hdfs.server.datanode.BlockReceiver.init(BlockReceiver.java:179)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
 at 
 org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
 at 
 org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
 at java.lang.Thread.run(Thread.java:662)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7997) The first non-existing xattr should also throw IOException

2015-03-27 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-7997:
--

 Summary: The first non-existing xattr should also throw IOException
 Key: HDFS-7997
 URL: https://issues.apache.org/jira/browse/HDFS-7997
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao


We use the following code snippet to get/set xattrs. However, if no xattrs have 
ever been set, the first getXAttr returns null while the second one throws an 
exception with a message like "At least one of the attributes provided was not 
found."  This is not expected; we believe they should behave in the same way - 
i.e. either both getXAttr calls return null or both throw an exception with the 
"... not found" message.  We will provide a patch to make them both throw an 
exception.


attrValueNM = fs.getXAttr(path, nm);
if (attrValueNM == null) {
 fs.setXAttr(nm, DEFAULT_VALUE);
}
attrValueNN = fs.getXAttr(path, nn);
if (attrValueNN == null) {
fs.setXAttr(nn, DEFAULT_VALUE);
}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7997) The first non-existing xattr should also throw IOException

2015-03-27 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7997:
---
Attachment: HDFS-7997-001.patch

 The first non-existing xattr should also throw IOException
 --

 Key: HDFS-7997
 URL: https://issues.apache.org/jira/browse/HDFS-7997
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7997-001.patch


 We use the following code snippet to get/set xattrs. However, if there are no 
 xattrs have ever been set, the first getXAttr returns null and the second one 
 just throws exception with message like At least one of the attributes 
 provided was not found..  This is not expected, we believe they should 
 behave in the same way - i.e either both getXAttr returns null or both 
 getXAttr throw exception with the message ... not found.  We will provide a 
 patch to make them both throw exception.
 
 attrValueNM = fs.getXAttr(path, nm);
 if (attrValueNM == null) {
  fs.setXAttr(nm, DEFAULT_VALUE);
 }
 attrValueNN = fs.getXAttr(path, nn);
 if (attrValueNN == null) {
 fs.setXAttr(nn, DEFAULT_VALUE);
 }
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7997) The first non-existing xattr should also throw IOException

2015-03-27 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14383708#comment-14383708
 ] 

zhouyingchao commented on HDFS-7997:


The failed test case is not related to the change.  I just verified that it 
fails even without the xattr changes.

 The first non-existing xattr should also throw IOException
 --

 Key: HDFS-7997
 URL: https://issues.apache.org/jira/browse/HDFS-7997
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7997-001.patch


 We use the following code snippet to get/set xattrs. However, if there are no 
 xattrs have ever been set, the first getXAttr returns null and the second one 
 just throws exception with message like At least one of the attributes 
 provided was not found..  This is not expected, we believe they should 
 behave in the same way - i.e either both getXAttr returns null or both 
 getXAttr throw exception with the message ... not found.  We will provide a 
 patch to make them both throw exception.
 
 attrValueNM = fs.getXAttr(path, nm);
 if (attrValueNM == null) {
  fs.setXAttr(nm, DEFAULT_VALUE);
 }
 attrValueNN = fs.getXAttr(path, nn);
 if (attrValueNN == null) {
 fs.setXAttr(nn, DEFAULT_VALUE);
 }
 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14349744#comment-14349744
 ] 

zhouyingchao commented on HDFS-7868:


Looks like there are some issues with the Jenkins system?
I'll resubmit the patch to kick off the build again.

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch, HDFS-7868-002.patch, 
 HDFS-7868-003.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.
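 To illustrate the effect, here is a rough sketch of the space check being 
 discussed, with made-up numbers; the names and the constant are approximations, 
 not copied from BlockPlacementPolicyDefault:
 {code}
 // Approximate shape of the check; names and constants are illustrative.
 static boolean hasEnoughRoom(long blockSize, long nodeRemaining, int blocksScheduled) {
   final int MIN_BLOCKS_FOR_WRITE = 1;                     // illustrative value
   long requiredSize  = blockSize * MIN_BLOCKS_FOR_WRITE;
   long scheduledSize = blockSize * blocksScheduled;       // space assumed taken by in-flight blocks
   return requiredSize <= nodeRemaining - scheduledSize;
 }
 
 // If replication passes in the 1 MB size of a file's last block while the file's real
 // block size is 128 MB, a node with 10 blocks scheduled appears to need only ~11 MB of
 // headroom instead of well over a gigabyte, so it may be chosen and the write or
 // replication may later fail for lack of space.
 {code}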



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Status: Open  (was: Patch Available)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: (was: HDFS-7868-004.patch)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: (was: HDFS-7868-003.patch)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: (was: HDFS-7868-002.patch)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: (was: HDFS-7868-001.patch)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-03-05 Thread zhouyingchao (JIRA)
zhouyingchao created HDFS-7897:
--

 Summary: Shutdown metrics when stopping JournalNode
 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao


In JournalNode.stop(), we forget to shut down the metrics system. The issue was 
found while reading the code.
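A minimal sketch of the kind of change intended, assuming the JournalNode metrics 
are registered with the usual DefaultMetricsSystem singleton (to be confirmed 
against the actual patch):
{code}
// at the end of JournalNode#stop(), alongside stopping the RPC and HTTP servers
org.apache.hadoop.metrics2.lib.DefaultMetricsSystem.shutdown();
{code}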



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: HDFS-7868-004.patch

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch, HDFS-7868-002.patch, 
 HDFS-7868-003.patch, HDFS-7868-004.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: HDFS-7868-002.patch

A new patch that passes the earlier failed cases on my computer.

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch, HDFS-7868-002.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: HDFS-7868-003.patch

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch, HDFS-7868-002.patch, 
 HDFS-7868-003.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-03-05 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14350003#comment-14350003
 ] 

zhouyingchao commented on HDFS-7897:


I tried the patch on my computer with the two failed cases reported by the build 
bot; both of them finish successfully.  The failures should not be related to 
the patch.

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7897-001.patch


 In JournalNode.stop(), the metrics system is forgotten to shutdown. The issue 
 is found when reading the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: HDFS-7868-001.patch

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Status: Patch Available  (was: Open)

Not sure what's wrong with the Jenkins build system. I just cancelled the patch, 
deleted all attachments, and am retrying to submit it.

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7897:
---
Affects Version/s: 2.6.0
   Status: Patch Available  (was: Open)

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao

 In JournalNode.stop(), the metrics system is forgotten to shutdown. The issue 
 is found when reading the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7897:
---
Attachment: HDFS-7897-001.patch

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7897-001.patch


 In JournalNode.stop(), the metrics system is forgotten to shutdown. The issue 
 is found when reading the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (HDFS-7897) Shutdown metrics when stopping JournalNode

2015-03-05 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao reassigned HDFS-7897:
--

Assignee: zhouyingchao

 Shutdown metrics when stopping JournalNode
 --

 Key: HDFS-7897
 URL: https://issues.apache.org/jira/browse/HDFS-7897
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao

 In JournalNode.stop(), the metrics system is forgotten to shutdown. The issue 
 is found when reading the code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-02 Thread zhouyingchao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14344208#comment-14344208
 ] 

zhouyingchao commented on HDFS-7868:


I'll investigate the failure in a few days, after I finish some other tasks.

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Attachment: HDFS-7868-001.patch

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-7868) Use proper blocksize to choose target for blocks

2015-03-02 Thread zhouyingchao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

zhouyingchao updated HDFS-7868:
---
Affects Version/s: 2.6.0
   Status: Patch Available  (was: Open)

 Use proper blocksize to choose target for blocks
 

 Key: HDFS-7868
 URL: https://issues.apache.org/jira/browse/HDFS-7868
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.6.0
Reporter: zhouyingchao
Assignee: zhouyingchao
 Attachments: HDFS-7868-001.patch


 In BlockPlacementPolicyDefault.java:isGoodTarget, the passed-in blockSize is 
 used to determine if there is enough room for a new block on a data node. 
 However, in two conditions the blockSize might not be proper for the purpose: 
 (a) the passed in block size is just the size of the last block of a file, 
 which might be very small (for e.g., called from 
 BlockManager.ReplicationWork.chooseTargets). (b) A file which might be 
 created with a smaller blocksize.
 In these conditions, the calculated scheduledSize might be smaller than the 
 actual value, which finally might lead to following failure of writing or 
 replication.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

