[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278625#comment-17278625
 ] 

Hadoop QA commented on HDFS-15792:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  2m  
9s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} branch-2.10 Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
33s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK 
Oracle Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~16.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
23s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK 
Oracle Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green}{color} | {color:green} branch-2.10 passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~16.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
23s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
22s{color} | {color:green}{color} | {color:green} branch-2.10 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Oracle Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~16.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
45s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 27s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/459/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt{color}
 | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 3 new + 
12 unchanged - 0 fixed = 15 total (was 12) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Oracle Corporation-1.7.0_95-b00 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
44s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~16.04-b01 {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
26s{color} | 

[jira] [Work logged] (HDFS-15683) Allow configuring DISK/ARCHIVE capacity for individual volumes

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15683?focusedWorklogId=547473&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547473
 ]

ASF GitHub Bot logged work on HDFS-15683:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 07:36
Start Date: 04/Feb/21 07:36
Worklog Time Spent: 10m 
  Work Description: Jing9 commented on a change in pull request #2625:
URL: https://github.com/apache/hadoop/pull/2625#discussion_r57126



##
File path: 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
##
@@ -815,6 +815,25 @@ public IOException call() {
 String.format("FAILED TO ADD: %s: %s%n",
 volume, ioe.getMessage()));
 LOG.error("Failed to add volume: {}", volume, ioe);
+/**
+ * TODO: Some cases are not supported yet with
+ *   same-disk-tiering on. For example, when replacing a
+ *   storage directory on the same mount, we have to check
+ *   whether the same storage type already exists on the
+ *   mount; in that case we need to remove the existing
+ *   volume first and then add the new one.
+ *   Also, we will need to adjust the new capacity ratio
+ *   when refreshing volumes in the future.
+ */
+if (ioe.getMessage()

Review comment:
   It may be better to check whether there is a conflict with the 
same-disk-tiering feature when we first load the refreshVolumes configuration, 
i.e., we can do the verification on changedVolumes.
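
   To make that concrete, a rough sketch of such a verification pass over the changed volumes, assuming a helper that resolves a volume's mount point (verifyNoTieringConflicts and getMount are hypothetical names, not existing DataNode methods; StorageLocation#getStorageType() is the real accessor):

{code:java}
// Assumed imports: java.util.*, java.io.IOException,
// org.apache.hadoop.fs.StorageType,
// org.apache.hadoop.hdfs.server.datanode.StorageLocation.
// Reject a reconfiguration that would put two volumes of the same
// StorageType on one mount point, before any volume is added or removed.
private void verifyNoTieringConflicts(List<StorageLocation> newVolumes)
    throws IOException {
  Map<String, EnumSet<StorageType>> typesByMount = new HashMap<>();
  for (StorageLocation loc : newVolumes) {
    String mount = getMount(loc);  // e.g. resolved from the OS mount table
    StorageType type = loc.getStorageType();
    EnumSet<StorageType> types = typesByMount
        .computeIfAbsent(mount, m -> EnumSet.noneOf(StorageType.class));
    if (!types.add(type)) {
      throw new IOException("Rejecting refreshVolumes: two " + type
          + " volumes would share mount " + mount
          + " with same-disk-tiering enabled");
    }
  }
}
{code}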





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547473)
Time Spent: 2.5h  (was: 2h 20m)

> Allow configuring DISK/ARCHIVE capacity for individual volumes
> --
>
> Key: HDFS-15683
> URL: https://issues.apache.org/jira/browse/HDFS-15683
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: datanode
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> This is a follow-up task for https://issues.apache.org/jira/browse/HDFS-15548
> In case the datanode disks are not uniform, we should allow admins to 
> configure capacity for individual volumes on top of the default one.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278580#comment-17278580
 ] 

Renukaprasad C commented on HDFS-15792:
---

Sure, thanks [~hexiaoqiao] for your time and support.

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with ClasscastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue:
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:953)
>   at 
> 
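
The quoted trace is the classic symptom of unsynchronized concurrent put() calls on a plain java.util.HashMap: a corrupted bin later gets treeified and a plain Node is cast where a TreeNode is expected. A minimal safe shape for several loader threads sharing one map, given as a generic sketch rather than the committed HDFS-15792 patch (the real fix is in the attachments above):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Either use a ConcurrentHashMap, or guard every access to a plain HashMap
// with one lock; unsynchronized concurrent writers are what corrupt bins.
public class SharedReferenceCounts {
  private final Map<String, Integer> counts = new ConcurrentHashMap<>();

  public void increment(String key) {
    counts.merge(key, 1, Integer::sum);  // atomic on ConcurrentHashMap
  }
}
{code}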

[jira] [Commented] (HDFS-15812) after deleting data of hbase table hdfs size is not decreasing

2021-02-03 Thread Satya Gaurav (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278578#comment-17278578
 ] 

Satya Gaurav commented on HDFS-15812:
-

[~surendralilhore] We created the table with KEEP_DELETED_CELLS => false.

So ideally the data should be deleted from HDFS as well, and snapshots are not 
enabled either.

We are just creating the table, putting data in, and then deleting the data of 
that column family.

The row count in the HBase table has gone down, but when we check the HDFS 
directory for HBase, the size is unchanged; we even ran a major compaction, 
with no impact on the HDFS size.

Below are the commands we are using:

create 'OBST:TEST',
{NAME => 'cfDocContent', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF',
TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false',
CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false',
IS_MOB => 'true', COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true',
BLOCKSIZE => '65536'},
{NAME => 'cfMetadata', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE',
CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'FAST_DIFF',
TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false',
CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'SNAPPY', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}

put 'OBST:TEST','Test101','cfDocContent:doc','row101'
put 'OBST:TEST','Test102','cfDocContent:doc','row102'
put 'OBST:TEST','Test103','cfDocContent:doc','row103'

put 'OBST:TEST','Test106','cfMetadata:doc','row106'
put 'OBST:TEST','Test107','cfMetadata:doc','row107'
put 'OBST:TEST','Test108','cfMetadata:doc','row108'

 

delete 'OBST:TEST','Test101','cfDocContent:doc'
delete 'OBST:TEST','Test102','cfDocContent:doc'
delete 'OBST:TEST','Test103','cfDocContent:doc'

 

delete 'OBST:TEST','Test106','cfMetadata:doc'
delete 'OBST:TEST','Test107','cfMetadata:doc'
delete 'OBST:TEST','Test108','cfMetadata:doc'

 

flush 'OBST:TEST'
compact 'OBST:TEST'
major_compact 'OBST:TEST', 'cfDocContent', 'MOB'
major_compact 'OBST:TEST'

 

_hbase(main):185:0>_ scan 'OBST:TEST', {RAW => true}
_ROW COLUMN+CELL_
_Test101 column=cfDocContent:doc, timestamp=1608282551809, type=Delete_
_Test102 column=cfDocContent:doc, timestamp=1608282551824, type=Delete_
_Test103 column=cfDocContent:doc, timestamp=1608282551838, type=Delete_

 

_Even the tombstone marks are there, and the size of the HDFS directory didn't 
reduce. It's not moving into trash either._

_Let me know if you need any additional information._

 

Regards,

Satya

> after deleting data of hbase table hdfs size is not decreasing
> --
>
> Key: HDFS-15812
> URL: https://issues.apache.org/jira/browse/HDFS-15812
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.0.2-alpha
> Environment: HDP 3.1.4.0-315
> Hbase 2.0.2.3.1.4.0-315
>Reporter: Satya Gaurav
>Priority: Major
>
> I am deleting data from an HBase table; it is deleted from the HBase table, 
> but the size of the HDFS directory is not reducing. I even ran a major 
> compaction, but even after that the HDFS size didn't reduce. Any solution for 
> this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15819:
--
Labels: pull-request-available  (was: )

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15819?focusedWorklogId=547454&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547454
 ]

ASF GitHub Bot logged work on HDFS-15819:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 06:31
Start Date: 04/Feb/21 06:31
Worklog Time Spent: 10m 
  Work Description: maobaolong opened a new pull request #2681:
URL: https://github.com/apache/hadoop/pull/2681


   https://issues.apache.org/jira/browse/HDFS-15819



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547454)
Remaining Estimate: 0h
Time Spent: 10m

> Fix a codestyle issue for TestQuotaByStorageType
> 
>
> Key: HDFS-15819
> URL: https://issues.apache.org/jira/browse/HDFS-15819
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Baolong Mao
>Assignee: Baolong Mao
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15819) Fix a codestyle issue for TestQuotaByStorageType

2021-02-03 Thread Baolong Mao (Jira)
Baolong Mao created HDFS-15819:
--

 Summary: Fix a codestyle issue for TestQuotaByStorageType
 Key: HDFS-15819
 URL: https://issues.apache.org/jira/browse/HDFS-15819
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs
Affects Versions: 3.4.0
Reporter: Baolong Mao
Assignee: Baolong Mao






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15816) If NO stale node in last choosing, the chooseTarget don't need to retry with stale nodes.

2021-02-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278570#comment-17278570
 ] 

Hadoop QA commented on HDFS-15816:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
45s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
23s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
14s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 0s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
20s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 19s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
28s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m 
58s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
55s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
10s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
12s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
12s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
5s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
16s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 35s{color} | {color:green}{color} | {color:green} patch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
22s{color} | {color:green}{color} | {color:green} the 

[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278556#comment-17278556
 ] 

Xiaoqiao He commented on HDFS-15792:


[~prasad-acit], maybe it is related to YARN-10036; Aajisaka is trying to fix 
it. I will try to trigger Yetus when backporting.

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with ClasscastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue:
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
>   at 
> 

[jira] [Commented] (HDFS-15812) after deleting data of hbase table hdfs size is not decreasing

2021-02-03 Thread Surendra Singh Lilhore (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278552#comment-17278552
 ] 

Surendra Singh Lilhore commented on HDFS-15812:
---

[~satycse06], this doc may help you understand; you need to check the 
HBase-side deletion policy:

[https://docs.cloudera.com/cdp-private-cloud-base/7.1.3/managing-hbase/topics/hbase-deletion.html]

I don't see any problem on the HDFS side.
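
In this particular table, note that cfDocContent is a MOB family (IS_MOB => 'true'): MOB cells live in separate MOB files that only a MOB compaction rewrites, and store files dropped by any compaction are first moved to the archive directory and removed later by the cleaner chore, so the visible size can stay flat right after a major compaction. Assuming the default hbase.rootdir layout (adjust the /hbase root for your cluster), you can see where the bytes actually are with:

hdfs dfs -du -s -h /hbase/data/OBST/TEST
hdfs dfs -du -s -h /hbase/mobdir
hdfs dfs -du -s -h /hbase/archive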

> after deleting data of hbase table hdfs size is not decreasing
> --
>
> Key: HDFS-15812
> URL: https://issues.apache.org/jira/browse/HDFS-15812
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.0.2-alpha
> Environment: HDP 3.1.4.0-315
> Hbase 2.0.2.3.1.4.0-315
>Reporter: Satya Gaurav
>Priority: Major
>
> I am deleting data from an HBase table; it is deleted from the HBase table, 
> but the size of the HDFS directory is not reducing. I even ran a major 
> compaction, but even after that the HDFS size didn't reduce. Any solution for 
> this issue?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15805) Hadoop prints sensitive Cookie information.

2021-02-03 Thread Takanobu Asanuma (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takanobu Asanuma updated HDFS-15805:

Target Version/s: 3.3.0, 3.1.1  (was: 3.1.1, 3.3.0)
  Resolution: Duplicate
  Status: Resolved  (was: Patch Available)

Thanks for your report, [~prasad-acit]. I closed this issue.

BTW, you didn't need to create a new jira; you could have converted this one 
to Hadoop Common.

> Hadoop prints sensitive Cookie information.
> ---
>
> Key: HDFS-15805
> URL: https://issues.apache.org/jira/browse/HDFS-15805
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Attachments: HDFS-15805.001.patch
>
>
> org.apache.hadoop.security.authentication.client.AuthenticatedURL.AuthCookieHandler#setAuthCookie
>  - prints cookie information in the log. Any sensitive information in cookies 
> will be logged, which needs to be avoided.
> LOG.trace("Setting token value to {} ({})", authCookie, oldCookie);
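
A common remedy, shown here only as a sketch (mask is a hypothetical helper, not an existing Hadoop method), is to log a redacted form so the trace line stays useful for correlation without exposing the value:

{code:java}
private static String mask(String cookie) {
  if (cookie == null) {
    return "null";
  }
  // Keep a short prefix plus the length; never log the full cookie value.
  int keep = Math.min(8, cookie.length());
  return cookie.substring(0, keep) + "...(" + cookie.length() + " chars)";
}

// LOG.trace("Setting token value to {} ({})", mask(authCookie), mask(oldCookie));
{code}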



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278523#comment-17278523
 ] 

Hadoop QA commented on HDFS-15813:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  1m 
46s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} No case conflicting files 
found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green}{color} | {color:green} The patch does not contain any 
@author tags. {color} |
| {color:green}+1{color} | {color:green} {color} | {color:green}  0m  0s{color} 
| {color:green}test4tests{color} | {color:green} The patch appears to include 1 
new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
53s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
 2s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m 
24s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
46s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
15s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
12s{color} | {color:green}{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 47s{color} | {color:green}{color} | {color:green} branch has no errors when 
building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green}{color} | {color:green} trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
55s{color} | {color:green}{color} | {color:green} trunk passed with JDK Private 
Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  3m 
16s{color} | {color:blue}{color} | {color:blue} Used deprecated FindBugs 
config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
43s{color} | {color:green}{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
44s{color} | {color:blue}{color} | {color:blue} Maven dependency ordering for 
patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 2s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  5m  
4s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  5m  
4s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
40s{color} | {color:green}{color} | {color:green} the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
40s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m 10s{color} | 
{color:orange}https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/457/artifact/out/diff-checkstyle-hadoop-hdfs-project.txt{color}
 | {color:orange} hadoop-hdfs-project: The patch generated 1 new + 86 unchanged 
- 1 fixed = 87 total (was 87) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
6s{color} | {color:green}{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green}{color} | {color:green} The patch has no whitespace 
issues. {color} |
| 

[jira] [Work logged] (HDFS-15818) Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15818?focusedWorklogId=547420&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547420
 ]

ASF GitHub Bot logged work on HDFS-15818:
-

Author: ASF GitHub Bot
Created on: 04/Feb/21 04:19
Start Date: 04/Feb/21 04:19
Worklog Time Spent: 10m 
  Work Description: LeonGao91 opened a new pull request #2679:
URL: https://github.com/apache/hadoop/pull/2679


   ## NOTICE
   
   Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HADOOP-X. Fix a typo in YYY.)
   For more details, please see 
https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547420)
Remaining Estimate: 0h
Time Spent: 10m

> Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig
> ---
>
> Key: HDFS-15818
> URL: https://issues.apache.org/jira/browse/HDFS-15818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Minor
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Current TestFsDatasetImpl.testReadLockCanBeDisabledByConfig is incorrect:
> 1) Test fails intermittently as holder can acquire lock first
> [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2666/1/testReport/]
>  
> 2) Test passes regardless of the setting of 
> DFS_DATANODE_LOCK_READ_WRITE_ENABLED_KEY
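
For context, one way to make such a test deterministic is to have the holder thread signal once it really owns the lock and keep holding it until the assertion has run. A sketch under assumptions (dataset stands for the FsDatasetImpl under test; the latch choreography is illustrative, not necessarily what the pull request does):

{code:java}
// Assumed imports: java.util.concurrent.CountDownLatch,
// org.apache.hadoop.util.AutoCloseableLock.
CountDownLatch held = new CountDownLatch(1);
CountDownLatch release = new CountDownLatch(1);
Thread holder = new Thread(() -> {
  try (AutoCloseableLock l = dataset.acquireDatasetReadLock()) {
    held.countDown();  // the read lock is definitely owned from here on
    release.await();   // keep holding until the main thread has asserted
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
  }
});
holder.start();
held.await();          // no sleeps or races: wait for real lock ownership
// ... assert that a write-lock attempt blocks or proceeds, depending on
// DFS_DATANODE_LOCK_READ_WRITE_ENABLED_KEY ...
release.countDown();
holder.join();
{code}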



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15818) Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15818:
--
Labels: pull-request-available  (was: )

> Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig
> ---
>
> Key: HDFS-15818
> URL: https://issues.apache.org/jira/browse/HDFS-15818
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Reporter: Leon Gao
>Assignee: Leon Gao
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Current TestFsDatasetImpl.testReadLockCanBeDisabledByConfig is incorrect:
> 1) Test fails intermittently as holder can acquire lock first
> [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2666/1/testReport/]
>  
> 2) Test passes regardless of the setting of 
> DFS_DATANODE_LOCK_READ_WRITE_ENABLED_KEY



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Xiaoqiao He (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278506#comment-17278506
 ] 

Xiaoqiao He commented on HDFS-15792:


LGTM. +1 on [^HDFS-15792-branch-2.10.002.patch]. [~sodonnell] would you mind 
taking another look?
The build failure does not look related to this patch. I will raise the failed 
build on the mailing list and sync back once there is a conclusion.

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with ClasscastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue:
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
>   at 
> 

[jira] [Commented] (HDFS-6869) Logfile mode in secure datanode is expected to be 644

2021-02-03 Thread Jason Wen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-6869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278496#comment-17278496
 ] 

Jason Wen commented on HDFS-6869:
-

I have the same question. With CDH 6.3.2 the secure DN log permission is 600, 
but with CDH 5.16.2 it is 644.

> Logfile mode in secure datanode is expected to be 644
> -
>
> Key: HDFS-6869
> URL: https://issues.apache.org/jira/browse/HDFS-6869
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Reporter: Guo Ruijing
>Priority: Major
>
> Logfile mode in secure datanode is expected to be 644.
> [root@centos64-2 hadoop-hdfs]# ll
> total 136
> -rw-r--r-- 1 hdfs hadoop 19455 Aug 18 22:40 
> hadoop-hdfs-journalnode-centos64-2.log
> -rw-r--r-- 1 hdfs hadoop   718 Aug 18 22:38 
> hadoop-hdfs-journalnode-centos64-2.out
> -rw-r--r-- 1 hdfs hadoop 62316 Aug 18 22:40 
> hadoop-hdfs-namenode-centos64-2.log
> -rw-r--r-- 1 hdfs hadoop   718 Aug 18 22:38 
> hadoop-hdfs-namenode-centos64-2.out
> -rw-r--r-- 1 hdfs hadoop 15485 Aug 18 22:39 hadoop-hdfs-zkfc-centos64-2.log
> -rw-r--r-- 1 hdfs hadoop   718 Aug 18 22:39 hadoop-hdfs-zkfc-centos64-2.out
> drwxr-xr-x 2 hdfs root 4096 Aug 18 22:38 hdfs
> -rw-r--r-- 1 hdfs hadoop 0 Aug 18 22:38 hdfs-audit.log
> -rw-r--r-- 1 hdfs hadoop 15029 Aug 18 22:40 SecurityAuth-hdfs.audit
> [root@centos64-2 hadoop-hdfs]#
> [root@centos64-2 hadoop-hdfs]#
> [root@centos64-2 hadoop-hdfs]# ll hdfs/
> total 52
> -rw------- 1 hdfs hadoop 43526 Aug 18 22:41 
> hadoop-hdfs-datanode-centos64-2.log   <<<   expected -rw-r--r-- (same with 
> namenode log)
> -rw-r--r-- 1 root root 734 Aug 18 22:38 
> hadoop-hdfs-datanode-centos64-2.out
> -rw------- 1 root root 343 Aug 18 22:38 jsvc.err
> -rw------- 1 root root   0 Aug 18 22:38 jsvc.out
> -rw------- 1 hdfs hadoop 0 Aug 18 22:38 SecurityAuth-hdfs.audit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13609) [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC

2021-02-03 Thread xuzq (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278488#comment-17278488
 ] 

xuzq commented on HDFS-13609:
-

Thanks [~xkrogen].
{quote}Why JN1 is lagging: You're saying this is happening because JN1 wrote 
some txns to its cache, but not onto disk. Can you elaborate on why this causes 
it to lag?
{quote}
JN1 is lagging because the running JN1 was restarted with a wrong 
_dfs.journalnode.edits.dir_.

On this question, I think there are some bugs:
 # The cache is not reflective of what is eventually written to disk in the Journal.
 # Treating _onlyDurableTxns_ as true in _selectRpcInputStreams_ is not correct, because we may only have a quorum of responses rather than responses from every journal, and the first responses to arrive may not contain the full edits. That can make it impossible to tail any edits from the journals (see the sketch below).
 # _onlyDurableTxns_ is true in _editLogTailer.catchupDuringFailover()_, but false in _getFSImage().editLog.openForWrite(getEffectiveLayoutVersion())_ in FSNamesystem#startActiveServices(). That can cause the NameNode to crash when failing over to active.
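
On point 2, the invariant can be stated compactly: a txid is durable once a majority of the full JN ensemble has it on disk, so among the endTxIds that actually responded, only the q-th largest (q = majority size) is provably safe to tail. When only a bare quorum answers, that value can lag the true durable maximum, which is exactly the stall described above. A generic sketch of the rule (not the actual QuorumJournalManager code):

{code:java}
// Given endTxIds reported by responding JournalNodes, return the highest
// txid provably durable on a majority of the FULL ensemble; -1 if the
// responses alone cannot prove anything durable.
static long durableThroughQuorum(long[] reportedEndTxIds, int ensembleSize) {
  int quorum = ensembleSize / 2 + 1;
  if (reportedEndTxIds.length < quorum) {
    return -1;
  }
  long[] sorted = reportedEndTxIds.clone();
  java.util.Arrays.sort(sorted);          // ascending
  return sorted[sorted.length - quorum];  // q-th largest response
}
{code}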

 

[~vagarychen] and [~shv],  thanks.

 

> [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via 
> RPC
> -
>
> Key: HDFS-13609
> URL: https://issues.apache.org/jira/browse/HDFS-13609
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: HDFS-12943, 3.3.0
>
> Attachments: HDFS-13609-HDFS-12943.000.patch, 
> HDFS-13609-HDFS-12943.001.patch, HDFS-13609-HDFS-12943.002.patch, 
> HDFS-13609-HDFS-12943.003.patch, HDFS-13609-HDFS-12943.004.patch
>
>
> See HDFS-13150 for the full design.
> This JIRA is targetted at the NameNode-side changes to enable tailing 
> in-progress edits via the RPC mechanism added in HDFS-13608. Most changes are 
> in the QuorumJournalManager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15779) EC: fix NPE caused by StripedWriter.clearBuffers during reconstruct block

2021-02-03 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17278477#comment-17278477
 ] 

Hui Fei commented on HDFS-15779:


[~wanghongbing] Thanks for the report and the fix.

Committed to trunk and cherry-picked to branches 3.3, 3.2, and 3.1.

> EC: fix NPE caused by StripedWriter.clearBuffers during reconstruct block
> -
>
> Key: HDFS-15779
> URL: https://issues.apache.org/jira/browse/HDFS-15779
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15779.001.patch, HDFS-15779.002.patch
>
>
> The NullPointerException in DN log as follows: 
> {code:java}
> 2020-12-28 15:49:25,453 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY
> //...
> 2020-12-28 15:51:25,551 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Connection timed out
> 2020-12-28 15:51:25,553 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Failed to reconstruct striped block: 
> BP-1922004198-10.83.xx.xx-1515033360950:blk_-9223372036804064064_6311920695
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedWriter.clearBuffers(StripedWriter.java:299)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.clearBuffers(StripedBlockReconstructor.java:139)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:115)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2020-12-28 15:51:25,749 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving 
> BP-1922004198-10.83.xx.xx-1515033360950:blk_-9223372036799445643_6313197139 
> src: /10.83.xxx.52:53198 dest: /10.83.xxx.52:50010
> {code}
> NPE occurs at `writer.getTargetBuffer()` in the code:
> {code:java}
> // StripedWriter#clearBuffers
> void clearBuffers() {
>   for (StripedBlockWriter writer : writers) {
> ByteBuffer targetBuffer = writer.getTargetBuffer();
> if (targetBuffer != null) {
>   targetBuffer.clear();
> }
>   }
> }
> {code}
> So, why is the writer null? Let's track when the writer is initialized and 
> when reconstruct() is called,  as follows:
> {code:java}
> // StripedBlockReconstructor#run
> public void run() {
>   try {
> initDecoderIfNecessary();
> getStripedReader().init();
> stripedWriter.init();  //①
> reconstruct();  //②
> stripedWriter.endTargetBlocks();
>   } catch (Throwable e) {
> LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
> // ...{code}
> They are called at ① and ② above respectively. `stripedWriter.init()` -> 
> `initTargetStreams()`, as follows:
> {code:java}
> // StripedWriter#initTargetStreams
> int initTargetStreams() {
>   int nSuccess = 0;
>   for (short i = 0; i < targets.length; i++) {
> try {
>   writers[i] = createWriter(i);
>   nSuccess++;
>   targetsStatus[i] = true;
> } catch (Throwable e) {
>   LOG.warn(e.getMessage());
> }
>   }
>   return nSuccess;
> }
> {code}
> NPE occurs when createWriter() gets an exception and  0 < nSuccess < 
> targets.length. 
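
Given that analysis, the minimal guard is to skip writer slots that were never initialized. A sketch of that shape; see the attached patches for what was actually committed:

{code:java}
// StripedWriter#clearBuffers, guarding slots whose createWriter() threw
// and left writers[i] == null.
void clearBuffers() {
  for (StripedBlockWriter writer : writers) {
    if (writer == null) {
      continue;  // this target stream was never initialized
    }
    ByteBuffer targetBuffer = writer.getTargetBuffer();
    if (targetBuffer != null) {
      targetBuffer.clear();
    }
  }
}
{code}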



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15779) EC: fix NPE caused by StripedWriter.clearBuffers during reconstruct block

2021-02-03 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-15779:
---
Fix Version/s: 3.2.3
   3.1.5
   3.4.0
   3.3.1
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> EC: fix NPE caused by StripedWriter.clearBuffers during reconstruct block
> -
>
> Key: HDFS-15779
> URL: https://issues.apache.org/jira/browse/HDFS-15779
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.2.0
>Reporter: Hongbing Wang
>Assignee: Hongbing Wang
>Priority: Major
> Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
> Attachments: HDFS-15779.001.patch, HDFS-15779.002.patch
>
>
> The NullPointerException in DN log as follows: 
> {code:java}
> 2020-12-28 15:49:25,453 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DatanodeCommand action: DNA_ERASURE_CODING_RECOVERY
> //...
> 2020-12-28 15:51:25,551 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Connection timed out
> 2020-12-28 15:51:25,553 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Failed to reconstruct striped block: 
> BP-1922004198-10.83.xx.xx-1515033360950:blk_-9223372036804064064_6311920695
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedWriter.clearBuffers(StripedWriter.java:299)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.clearBuffers(StripedBlockReconstructor.java:139)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.reconstruct(StripedBlockReconstructor.java:115)
> at 
> org.apache.hadoop.hdfs.server.datanode.erasurecode.StripedBlockReconstructor.run(StripedBlockReconstructor.java:60)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> 2020-12-28 15:51:25,749 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Receiving 
> BP-1922004198-10.83.xx.xx-1515033360950:blk_-9223372036799445643_6313197139 
> src: /10.83.xxx.52:53198 dest: /10.83.xxx.52:50010
> {code}
> NPE occurs at `writer.getTargetBuffer()` in the code:
> {code:java}
> // StripedWriter#clearBuffers
> void clearBuffers() {
>   for (StripedBlockWriter writer : writers) {
> ByteBuffer targetBuffer = writer.getTargetBuffer();
> if (targetBuffer != null) {
>   targetBuffer.clear();
> }
>   }
> }
> {code}
> So, why is the writer null? Let's track when the writer is initialized and 
> when reconstruct() is called,  as follows:
> {code:java}
> // StripedBlockReconstructor#run
> public void run() {
>   try {
> initDecoderIfNecessary();
> getStripedReader().init();
> stripedWriter.init();  //①
> reconstruct();  //②
> stripedWriter.endTargetBlocks();
>   } catch (Throwable e) {
> LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
> // ...{code}
> They are called at ① and ② above respectively. `stripedWriter.init()` -> 
> `initTargetStreams()`, as follows:
> {code:java}
> // StripedWriter#initTargetStreams
> int initTargetStreams() {
>   int nSuccess = 0;
>   for (short i = 0; i < targets.length; i++) {
>     try {
>       writers[i] = createWriter(i);
>       nSuccess++;
>       targetsStatus[i] = true;
>     } catch (Throwable e) {
>       LOG.warn(e.getMessage());
>     }
>   }
>   return nSuccess;
> }
> {code}
> The NPE occurs when createWriter() throws for some target and 0 < nSuccess < 
> targets.length: the failed entries in writers[] remain null, yet reconstruction 
> still proceeds, and clearBuffers() later dereferences them.
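A guard for this case could look like the following (a sketch for illustration; the committed fix is in the attached patches and may differ):
{code:java}
// Sketch only, not the committed fix: skip writers that were never
// created because createWriter() threw in initTargetStreams().
void clearBuffers() {
  for (StripedBlockWriter writer : writers) {
    if (writer == null) {
      continue; // this target stream failed to initialize
    }
    ByteBuffer targetBuffer = writer.getTargetBuffer();
    if (targetBuffer != null) {
      targetBuffer.clear();
    }
  }
}
{code}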



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15816) If NO stale node in last choosing, the chooseTarget don't need to retry with stale nodes.

2021-02-03 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15816:

Attachment: (was: HDFS-15816.001.patch)

> If NO stale node in last choosing, the chooseTarget don't need to retry with 
> stale nodes.
> -
>
> Key: HDFS-15816
> URL: https://issues.apache.org/jira/browse/HDFS-15816
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: block placement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15816.001.patch
>
>
> If there was NO stale node in the last choosing, chooseTarget does not need to 
> retry with stale nodes.
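The improvement amounts to a short-circuit on the retry decision. A minimal model of that decision, with illustrative names; the actual change lives in the block placement policy and the attached HDFS-15816.001.patch:
{code:java}
// Model of the retry decision, not the actual Hadoop code: retrying
// with stale nodes allowed can only help if the previous pass actually
// skipped at least one stale node.
static boolean shouldRetryWithStaleNodes(boolean avoidStaleNodes,
    int stillNeeded, boolean sawStaleNodeInLastChoosing) {
  return avoidStaleNodes && stillNeeded > 0 && sawStaleNodeInLastChoosing;
}
{code}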



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15816) If NO stale node in last choosing, the chooseTarget don't need to retry with stale nodes.

2021-02-03 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15816:

Attachment: HDFS-15816.001.patch
Status: Patch Available  (was: Open)

> If NO stale node in last choosing, the chooseTarget don't need to retry with 
> stale nodes.
> -
>
> Key: HDFS-15816
> URL: https://issues.apache.org/jira/browse/HDFS-15816
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: block placement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15816.001.patch, HDFS-15816.001.patch
>
>
> If there was NO stale node in the last choosing, chooseTarget does not need to 
> retry with stale nodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-15818) Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig

2021-02-03 Thread Leon Gao (Jira)
Leon Gao created HDFS-15818:
---

 Summary: Fix TestFsDatasetImpl.testReadLockCanBeDisabledByConfig
 Key: HDFS-15818
 URL: https://issues.apache.org/jira/browse/HDFS-15818
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Leon Gao
Assignee: Leon Gao


The current TestFsDatasetImpl.testReadLockCanBeDisabledByConfig is incorrect:

1) The test fails intermittently, because the lock-holder thread can acquire the 
lock first

[https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2666/1/testReport/]

2) The test passes regardless of the setting of 
DFS_DATANODE_LOCK_READ_WRITE_ENABLED_KEY (see the sketch below)
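A deterministic probe can address both points: latches remove the ordering race from 1), and for 2) a genuine read lock is shared, so a second acquisition succeeds while another thread holds it, whereas the disabled configuration (where reads fall back to the exclusive lock) blocks. A sketch with plain java.util.concurrent types; the real test works against the DataNode's instrumented lock, and the fallback-to-exclusive behavior is an assumption here:
{code:java}
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Returns true if `readLock` behaves as a shared (real) read lock: a
// second acquisition succeeds while another thread already holds it.
static boolean readLockIsShared(Lock readLock) throws InterruptedException {
  CountDownLatch holderHasLock = new CountDownLatch(1);
  CountDownLatch probeDone = new CountDownLatch(1);
  Thread holder = new Thread(() -> {
    readLock.lock();
    try {
      holderHasLock.countDown();
      probeDone.await(); // keep holding until the probe finishes
    } catch (InterruptedException ignored) {
    } finally {
      readLock.unlock();
    }
  });
  holder.start();
  holderHasLock.await(); // no race: the holder is guaranteed to go first
  boolean acquired = readLock.tryLock(200, TimeUnit.MILLISECONDS);
  if (acquired) {
    readLock.unlock();
  }
  probeDone.countDown();
  holder.join();
  return acquired;
}
{code}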



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278406#comment-17278406
 ] 

Jim Brennan commented on HDFS-15813:


[~kihwal] thanks for the detailed test suggestion.  I have added a test in 
patch 003.  I've verified that this test fails in trunk and passes with this 
change.


> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.
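The core of the change can be modeled as a time-based decision taken on every pass of the streaming loop, not only while the data queue is idle. A model only; the half-timeout threshold mirrors the existing idle-queue heartbeat and is an assumption here:
{code:java}
// Model of the heartbeat decision, not the actual DataStreamer code.
// Sending a heartbeat whenever half the socket timeout has elapsed since
// the last packet keeps the pipeline alive even when a slow disk stalls
// the data packets themselves.
static boolean shouldSendHeartbeat(long nowMs, long lastPacketSentMs,
    long socketTimeoutMs) {
  return nowMs - lastPacketSentMs > socketTimeoutMs / 2;
}
{code}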



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan updated HDFS-15813:
---
Attachment: HDFS-15813.003.patch

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch, 
> HDFS-15813.003.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15790) Make ProtobufRpcEngineProtos and ProtobufRpcEngineProtos2 Co-Exist

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15790?focusedWorklogId=547301=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547301
 ]

ASF GitHub Bot logged work on HDFS-15790:
-

Author: ASF GitHub Bot
Created on: 03/Feb/21 21:53
Start Date: 03/Feb/21 21:53
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2650:
URL: https://github.com/apache/hadoop/pull/2650#issuecomment-772849202


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 32s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 5 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 30s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  22m  1s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 59s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |  19m 13s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   4m 15s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   6m  5s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 26s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   4m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   6m 26s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   1m 25s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |  11m 27s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 56s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | -1 :x: |  cc  |  19m 56s | 
[/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/4/artifact/out/diff-compile-cc-root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04.txt)
 |  root-jdkUbuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 generated 37 new + 375 unchanged - 37 
fixed = 412 total (was 412)  |
   | +1 :green_heart: |  javac  |  19m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  17m 55s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | -1 :x: |  cc  |  17m 55s | 
[/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/4/artifact/out/diff-compile-cc-root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01.txt)
 |  root-jdkPrivateBuild-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 generated 44 new + 368 
unchanged - 44 fixed = 412 total (was 412)  |
   | +1 :green_heart: |  javac  |  17m 55s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   4m  1s | 
[/diff-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2650/4/artifact/out/diff-checkstyle-root.txt)
 |  root: The patch generated 4 new + 557 unchanged - 3 fixed = 561 total (was 
560)  |
   | +1 :green_heart: |  mvnsite  |   6m  8s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  13m  6s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   4m 44s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   6m 32s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |  12m 11s |  |  the patch passed  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 15s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   2m 39s |  |  hadoop-hdfs-client in the patch 
passed.  |
   | -1 :x: |  unit  | 178m 12s | 

[jira] [Work logged] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?focusedWorklogId=547207=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547207
 ]

ASF GitHub Bot logged work on HDFS-15817:
-

Author: ASF GitHub Bot
Created on: 03/Feb/21 18:27
Start Date: 03/Feb/21 18:27
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #2677:
URL: https://github.com/apache/hadoop/pull/2677#issuecomment-772722672


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |   |   0m  0s | [test4tests](test4tests) |  The patch 
appears to include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  33m 24s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  compile  |   1m 16s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  checkstyle  |   0m 58s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 22s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  15m 13s |  |  branch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 53s |  |  trunk passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 29s |  |  trunk passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +0 :ok: |  spotbugs  |   3m 13s |  |  Used deprecated FindBugs config; 
considering switching to SpotBugs.  |
   | +1 :green_heart: |  findbugs  |   3m 11s |  |  trunk passed  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 12s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 17s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javac  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  javac  |   1m 10s |  |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 57s | 
[/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/1/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs-project/hadoop-hdfs: The patch generated 6 new + 28 unchanged - 
0 fixed = 34 total (was 28)  |
   | +1 :green_heart: |  mvnsite  |   1m 17s |  |  the patch passed  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  |  The patch has no 
whitespace issues.  |
   | +1 :green_heart: |  shadedclient  |  12m 43s |  |  patch has no errors 
when building and testing our client artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  the patch passed with JDK 
Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK 
Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01  |
   | +1 :green_heart: |  findbugs  |   3m 16s |  |  the patch passed  |
    _ Other Tests _ |
   | -1 :x: |  unit  | 192m 54s | 
[/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/1/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt)
 |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 43s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 279m 18s |  |  |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdfs.server.blockmanagement.TestUnderReplicatedBlocks |
   |   | hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2677/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/2677 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient findbugs checkstyle |
   | uname | Linux 7544dcf31cf5 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 
23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 66ecee333e0 |
   | Default Java | Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01 |
   | Multi-JDK 

[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278239#comment-17278239
 ] 

Jim Brennan commented on HDFS-15813:


Thanks [~kihwal].  I will try this.


> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278224#comment-17278224
 ] 

Kihwal Lee edited comment on HDFS-15813 at 2/3/21, 5:19 PM:


As for the safety of the change, I feel confident since this has been running 
in our production for many years. But to prevent future breakage, we might 
need a unit test.

It might be possible to simulate the condition by delaying the ACK for last 
packet using {{DataNodeFaultInjector}}. If we do 1) write data, 2) hflush, 3) 
enable the fault injector, and 4) close(), the existing 
{{delaySendingAckToUpstream()}} may be utilized.  Just one datanode is needed. 
Even if the ACK is delayed beyond the value of "dfs.client.socket-timeout", the 
pipeline should not break. Because of the heartbeat packets, the BlockReceiver 
in the datanode won't get a read timeout while the client is waiting for the 
last ack.


was (Author: kihwal):
As for the safety of the change, I feel confident since this has been running 
in our production for many years. But to prevent future breakage, we might 
need a unit test.

It might be possible to simulate the condition by delaying the ACK for last 
packet using {{DataNodeFaultInjector}}. If we do 1) write data, 2) hflush, 3) 
enable the fault injector, and 4) close(), the existing 
{{delaySendingAckToUpstream()}} may be utilized.  Just one datanode is needed. 
Even if the ACK is delayed beyond the value of 
"ipc.client.connection.maxidletime", the pipeline should not break. Because of 
the heartbeat packets, the BlockReceiver in the datanode won't get a read 
timeout while the client is waiting for the last ack.

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12763) DataStreamer should heartbeat during flush

2021-02-03 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-12763:
--
Resolution: Duplicate
Status: Resolved  (was: Patch Available)

> DataStreamer should heartbeat during flush
> --
>
> Key: HDFS-12763
> URL: https://issues.apache.org/jira/browse/HDFS-12763
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.8.1
>Reporter: Kuhu Shukla
>Assignee: Kuhu Shukla
>Priority: Major
> Attachments: HDFS-12763.001.patch
>
>
> From HDFS-5032:
> bq. Absence of heartbeat during flush will be fixed in a separate jira by 
> Daryn Sharp
> This JIRA tracks the case where absence of heartbeat can cause the pipeline 
> to fail if operations like flush take some time to complete.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278224#comment-17278224
 ] 

Kihwal Lee commented on HDFS-15813:
---

As for the safety of the change, I feel confident since this has been running 
in our production for many years. But to prevent future breakage, we might 
need a unit test.

It might be possible to simulate the condition by delaying the ACK for last 
packet using {{DataNodeFaultInjector}}. If we do 1) write data, 2) hflush, 3) 
enable the fault injector, and 4) close(), the existing 
{{delaySendingAckToUpstream()}} may be utilized.  Just one datanode is needed. 
Even if the ACK is delayed beyond the value of 
"ipc.client.connection.maxidletime", the pipeline should not break. Because of 
the heartbeat packets, the BlockReceiver in the datanode won't get a read 
timeout while the client is waiting for the last ack.
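A sketch of that test shape (MiniDFSCluster and DataNodeFaultInjector are existing test utilities; the sizes, timeouts, and delay below are assumptions, and the committed test may differ):
{code:java}
DataNodeFaultInjector oldInjector = DataNodeFaultInjector.get();
Configuration conf = new HdfsConfiguration();
conf.setInt(HdfsClientConfigKeys.DFS_CLIENT_SOCKET_TIMEOUT_KEY, 3000);
MiniDFSCluster cluster =
    new MiniDFSCluster.Builder(conf).numDataNodes(1).build();
try {
  cluster.waitActive();
  DistributedFileSystem fs = cluster.getFileSystem();
  FSDataOutputStream out = fs.create(new Path("/heartbeatTest"));
  out.write(new byte[4096]);   // 1) write data
  out.hflush();                // 2) hflush
  // 3) delay the ack of the last packet beyond the socket timeout
  DataNodeFaultInjector.set(new DataNodeFaultInjector() {
    @Override
    public void delaySendingAckToUpstream(String upstreamAddr)
        throws IOException {
      try {
        Thread.sleep(8000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  });
  out.close(); // 4) close() should succeed: heartbeats keep the pipeline alive
} finally {
  DataNodeFaultInjector.set(oldInjector);
  cluster.shutdown();
}
{code}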

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278185#comment-17278185
 ] 

Jim Brennan commented on HDFS-15813:


I searched again and found this had been put up in [HDFS-12763] by [~kshukla] 
back in 2017.

[~kihwal], [~daryn] given that we have been running with this in production for 
5 years, I am not sure it is worth the effort to build a unit test for this.   
Let me know what you think.

My inclination is to stick with this jira, since the patch is already up to 
date with trunk, and close HDFS-12763 as a duplicate.  If you'd prefer I move 
the patch to HDFS-12763 and mark this one as a duplicate, I can do that instead.


> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13609) [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via RPC

2021-02-03 Thread Erik Krogen (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-13609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278150#comment-17278150
 ] 

Erik Krogen commented on HDFS-13609:


Thanks for the detailed explanation and example [~xuzq_zander]! I see now the 
issue. I believe the logic in the snippet you shared is doing the right thing 
-- and is necessary for correctness. I guess there are two issues being 
discussed here:

# Why JN1 is lagging: You're saying this is happening because JN1 wrote some 
txns to its cache, but not onto disk. Can you elaborate on why this causes it 
to lag? It's been a long time since I looked at this code. Regardless, I agree 
that this is definitely a bug, and we should be doing whatever is necessary to 
keep the cache reflective of what eventually got written to disk. I don't know 
if the right approach is to write to the cache after the disk (this may cause 
performance issues?) or to invalidate the cache if the disk write fails.
# Why JN1 lagging is causing broader issues: We use 
{{loggers.waitForWriteQuorum}} which returns as soon as it gets a quorum of 
responses, but in some cases we actually want to wait a bit longer but get more 
responses, for example in the case you described where JN1 keeps responding 
with a low txid. I actually have some memory of discussing this back in the 
implementation days with [~shv] and [~vagarychen] but don't remember the 
conclusion -- I think we were waiting to see if this became an issue in 
practice.

It sounds like (1) is a legitimate bug, and (2) is kind of a bug and kind of a 
performance/reliability enhancement. Unfortunately I'm no longer working in 
this area so I can't spend much time beyond providing high-level input, but 
these both sound like good areas for improvement. Perhaps [~shv] can provide 
input on whether we're seeing similar issues on our end and whether or not he 
remembers any discussions on these matters.
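For issue (1), the invariant is that the cache must never claim transactions that did not make it to disk. A purely illustrative ordering, with method names that are assumptions rather than the JournalNode API:
{code:java}
// Illustration only: write durably first, publish to the cache second,
// and make sure a failed disk write leaves no trace in the cache.
void persistEdits(byte[] edits, long firstTxId, long lastTxId)
    throws IOException {
  try {
    writeToDisk(edits, firstTxId, lastTxId);
  } catch (IOException e) {
    cache.invalidateFrom(firstTxId); // or simply never populate it
    throw e;
  }
  cache.store(edits, firstTxId, lastTxId);
}
{code}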

> [Edit Tail Fast Path Pt 3] NameNode-side changes to support tailing edits via 
> RPC
> -
>
> Key: HDFS-13609
> URL: https://issues.apache.org/jira/browse/HDFS-13609
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: HDFS-12943, 3.3.0
>
> Attachments: HDFS-13609-HDFS-12943.000.patch, 
> HDFS-13609-HDFS-12943.001.patch, HDFS-13609-HDFS-12943.002.patch, 
> HDFS-13609-HDFS-12943.003.patch, HDFS-13609-HDFS-12943.004.patch
>
>
> See HDFS-13150 for the full design.
> This JIRA is targetted at the NameNode-side changes to enable tailing 
> in-progress edits via the RPC mechanism added in HDFS-13608. Most changes are 
> in the QuorumJournalManager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278147#comment-17278147
 ] 

Renukaprasad C commented on HDFS-15792:
---

[~hexiaoqiao] The build failed again; is it due to the patch? How can I see the 
build reports?

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue: 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)

[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Renukaprasad C (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278139#comment-17278139
 ] 

Renukaprasad C commented on HDFS-15792:
---

Thanks [~hexiaoqiao]
I have prepared a patch for branch-2.10; I can compile it locally on JDK 8 with 
target version 7 and run the UTs.

Some of the code has a compatibility issue (when compiled on JDK 8 with 
java.version=1.7 in the POM); to avoid the issue it is copied into a HashMap. 
This code has limited usage (tests only), so it shouldn't cause any problem.

  @VisibleForTesting
  public ImmutableList<E> getEntries() {
    return new ImmutableList.Builder<E>().addAll(referenceMap.keySet()).build();
  }

Also, in AclFeature I replaced the AtomicInteger with a plain int, as before, and 
synchronized concurrent access to the value.
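That change would look roughly like the following (a sketch of the approach described above; the authoritative version is HDFS-15792-branch-2.10.002.patch):
{code:java}
// Sketch: a plain int guarded by synchronized accessors in AclFeature,
// replacing the AtomicInteger.
private int refCount = 0;

public synchronized int getRefCount() {
  return refCount;
}

synchronized int incrementAndGetRefCount() {
  return ++refCount;
}

synchronized int decrementAndGetRefCount() {
  return --refCount;
}
{code}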

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue: 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> 

[jira] [Commented] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278140#comment-17278140
 ] 

Hadoop QA commented on HDFS-15792:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime ||  Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red}  0m 
56s{color} | {color:red}{color} | {color:red} Docker failed to build 
yetus/hadoop:7257b17793d. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-15792 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/13019934/HDFS-15792-branch-2.10.002.patch
 |
| Console output | 
https://ci-hadoop.apache.org/job/PreCommit-HDFS-Build/456/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |


This message was automatically generated.



> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue: 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> 

[jira] [Updated] (HDFS-15792) ClasscastException while loading FSImage

2021-02-03 Thread Renukaprasad C (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Renukaprasad C updated HDFS-15792:
--
Attachment: HDFS-15792-branch-2.10.002.patch

> ClasscastException while loading FSImage
> 
>
> Key: HDFS-15792
> URL: https://issues.apache.org/jira/browse/HDFS-15792
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: nn
>Reporter: Renukaprasad C
>Assignee: Renukaprasad C
>Priority: Major
> Fix For: 3.3.1, 3.4.0
>
> Attachments: HDFS-15792-branch-2.10.001.patch, 
> HDFS-15792-branch-2.10.002.patch, HDFS-15792.001.patch, HDFS-15792.002.patch, 
> HDFS-15792.003.patch, HDFS-15792.004.patch, HDFS-15792.005.patch, 
> HDFS-15792.addendum.001.patch, image-2021-01-27-12-00-34-846.png
>
>
> FSImage loading has failed with a ClassCastException - 
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode.
> This is a usage issue with HashMap in concurrent scenarios.
> The same issue has been reported against Java and closed as a usage issue: 
> https://bugs.openjdk.java.net/browse/JDK-8173671
> 2020-12-28 11:36:26,127 | ERROR | main | An exception occurred when loading 
> INODE from fsiamge. | FSImageFormatProtobuf.java:442
> java.lang.ClassCastException: java.util.HashMap$Node cannot be cast to 
> java.util.HashMap$TreeNode
>   at java.util.HashMap$TreeNode.moveRootToFront(HashMap.java:1835)
>   at java.util.HashMap$TreeNode.treeify(HashMap.java:1951)
>   at java.util.HashMap.treeifyBin(HashMap.java:772)
>   at java.util.HashMap.putVal(HashMap.java:644)
>   at java.util.HashMap.put(HashMap.java:612)
>   at 
> org.apache.hadoop.hdfs.util.ReferenceCountMap.put(ReferenceCountMap.java:53)
>   at 
> org.apache.hadoop.hdfs.server.namenode.AclStorage.addAclFeature(AclStorage.java:391)
>   at 
> org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields.addAclFeature(INodeWithAdditionalFields.java:349)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeDirectory(FSImageFormatPBINode.java:225)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:406)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.readPBINodes(FSImageFormatPBINode.java:367)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:342)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader$2.call(FSImageFormatProtobuf.java:469)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> 2020-12-28 11:36:26,130 | ERROR | main | Failed to load image from 
> FSImageFile(file=/srv/BigData/namenode/current/fsimage_00198227480, 
> cpktTxId=00198227480) | FSImage.java:738
> java.io.IOException: java.lang.ClassCastException: java.util.HashMap$Node 
> cannot be cast to java.util.HashMap$TreeNode
>   at 
> org.apache.hadoop.io.MultipleIOException$Builder.add(MultipleIOException.java:68)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.runLoaderTasks(FSImageFormatProtobuf.java:444)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:360)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:263)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:227)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:971)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:955)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImageFile(FSImage.java:820)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:733)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:331)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1113)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:730)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:648)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:710)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:953)
>   at 
> 

[jira] [Commented] (HDFS-15798) EC: Reconstruct task failed, and It would be XmitsInProgress of DN has negative number

2021-02-03 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278113#comment-17278113
 ] 

Stephen O'Donnell commented on HDFS-15798:
--

+1 on HDFS-15798.003.patch from me too. I guess we should commit this down to 
branch-3.1 if the issue has been present since then.

I will try to commit it later (at this stage today, probably tomorrow).

> EC: Reconstruct task failed, and It would be XmitsInProgress of DN has 
> negative number
> --
>
> Key: HDFS-15798
> URL: https://issues.apache.org/jira/browse/HDFS-15798
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: huhaiyang
>Assignee: huhaiyang
>Priority: Major
> Attachments: HDFS-15798.001.patch, HDFS-15798.002.patch, 
> HDFS-15798.003.patch
>
>
> When an EC reconstruction task fails, processErasureCodingTasks decrements 
> xmitsInProgress by an incorrect value; as a result the DataNode's 
> XmitsInProgress can go negative, which affects how the NN chooses pending 
> tasks based on the ratio between the lengths of the replication and 
> erasure-coded block queues.
> {code:java}
> // 1.ErasureCodingWorker.java
> public void processErasureCodingTasks(
>     Collection<BlockECReconstructionInfo> ecTasks) {
>   for (BlockECReconstructionInfo reconInfo : ecTasks) {
>     int xmitsSubmitted = 0;
>     try {
>       ...
>       // It may throw IllegalArgumentException from task#stripedReader
>       // constructor.
>       final StripedBlockReconstructor task =
>           new StripedBlockReconstructor(this, stripedReconInfo);
>       if (task.hasValidTargets()) {
>         // See HDFS-12044. We increase xmitsInProgress even the task is only
>         // enqueued, so that
>         //   1) NN will not send more tasks than what DN can execute and
>         //   2) DN will not throw away reconstruction tasks, and instead keeps
>         //      an unbounded number of tasks in the executor's task queue.
>         xmitsSubmitted = Math.max((int)(task.getXmits() * xmitWeight), 1);
>         getDatanode().incrementXmitsInProcess(xmitsSubmitted); // task start increment
>         stripedReconstructionPool.submit(task);
>       } else {
>         LOG.warn("No missing internal block. Skip reconstruction for task:{}",
>             reconInfo);
>       }
>     } catch (Throwable e) {
>       // task failed decrement; XmitsInProgress is decremented by the previous value
>       getDatanode().decrementXmitsInProgress(xmitsSubmitted);
>       LOG.warn("Failed to reconstruct striped block {}",
>           reconInfo.getExtendedBlock().getLocalBlock(), e);
>     }
>   }
> }
> // 2.StripedBlockReconstructor.java
> public void run() {
>   try {
>     initDecoderIfNecessary();
>     ...
>   } catch (Throwable e) {
>     LOG.warn("Failed to reconstruct striped block: {}", getBlockGroup(), e);
>     getDatanode().getMetrics().incrECFailedReconstructionTasks();
>   } finally {
>     float xmitWeight = getErasureCodingWorker().getXmitWeight();
>     // if the xmits is smaller than 1, the xmitsSubmitted should be set to 1
>     // because if it is set to zero, we cannot measure the xmits submitted
>     int xmitsSubmitted = Math.max((int) (getXmits() * xmitWeight), 1);
>     getDatanode().decrementXmitsInProgress(xmitsSubmitted); // task complete decrement
>     ...
>   }
> }
> {code}
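The invariant the fix has to restore is that each task decrements exactly the amount it incremented, exactly once. A minimal illustration of that invariant (not the actual HDFS-15798 patch):
{code:java}
// Illustration only: record the amount added for a task and remove that
// same amount once. Recomputing the weight on the way down can yield a
// different value and drive the counter negative.
private final java.util.concurrent.atomic.AtomicInteger xmitsInProgress =
    new java.util.concurrent.atomic.AtomicInteger();

int startTask(int xmitsForTask) {
  xmitsInProgress.addAndGet(xmitsForTask);
  return xmitsForTask; // caller keeps this and passes it to finishTask()
}

void finishTask(int xmitsRecordedAtStart) {
  xmitsInProgress.addAndGet(-xmitsRecordedAtStart);
}
{code}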



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15813) DataStreamer: keep sending heartbeat packets while streaming

2021-02-03 Thread Jim Brennan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278054#comment-17278054
 ] 

Jim Brennan commented on HDFS-15813:


[~kihwal] do you think we need a unit test for this?  

> DataStreamer: keep sending heartbeat packets while streaming
> 
>
> Key: HDFS-15813
> URL: https://issues.apache.org/jira/browse/HDFS-15813
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 3.4.0
>Reporter: Jim Brennan
>Assignee: Jim Brennan
>Priority: Major
> Attachments: HDFS-15813.001.patch, HDFS-15813.002.patch
>
>
> In response to [HDFS-5032], [~daryn] made a change to our internal code to 
> ensure that heartbeats continue during data streaming, even in the face of a 
> slow disk.
> As [~kihwal] noted, absence of heartbeat during flush will be fixed in a 
> separate jira.  It doesn't look like this change was ever pushed back to 
> apache, so I am providing it here.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15799) Make DisallowedDatanodeException terse

2021-02-03 Thread Kihwal Lee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-15799:
--
Fix Version/s: 3.2.3
   3.4.0
   3.3.1
 Hadoop Flags: Reviewed
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

I've committed this to trunk, branch-3.3 and branch-3.2. Thanks for working on 
this, [~richard-ross].

> Make DisallowedDatanodeException terse
> --
>
> Key: HDFS-15799
> URL: https://issues.apache.org/jira/browse/HDFS-15799
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Richard
>Assignee: Richard
>Priority: Minor
> Fix For: 3.3.1, 3.4.0, 3.2.3
>
> Attachments: HDFS-15799.001.patch
>
>
> When org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException is 
> thrown back to a datanode, the namenode logs a full stack trace.
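Hadoop's RPC server already supports demoting specific exception types to a one-line log entry via addTerseExceptions; the change is presumably a registration along these lines (a sketch; the exact call site is defined by the patch):
{code:java}
// Register the exception as "terse" so the RPC server logs a single
// line instead of a full stack trace when it is thrown back to a DN.
rpcServer.addTerseExceptions(DisallowedDatanodeException.class);
{code}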



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-15799) Make DisallowedDatanodeException terse

2021-02-03 Thread Kihwal Lee (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17278040#comment-17278040
 ] 

Kihwal Lee commented on HDFS-15799:
---

+1 lgtm. The failed tests don't seem related to the patch.

> Make DisallowedDatanodeException terse
> --
>
> Key: HDFS-15799
> URL: https://issues.apache.org/jira/browse/HDFS-15799
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Affects Versions: 2.10.1, 3.4.0
>Reporter: Richard
>Assignee: Richard
>Priority: Minor
> Attachments: HDFS-15799.001.patch
>
>
> When org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException is 
> thrown back to a datanode, the namenode logs a full stack trace.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?focusedWorklogId=547010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-547010
 ]

ASF GitHub Bot logged work on HDFS-15817:
-

Author: ASF GitHub Bot
Created on: 03/Feb/21 13:46
Start Date: 03/Feb/21 13:46
Worklog Time Spent: 10m 
  Work Description: bshashikant opened a new pull request #2677:
URL: https://github.com/apache/hadoop/pull/2677


   Please see https://issues.apache.org/jira/browse/HDFS-15817 for details.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 547010)
Remaining Estimate: 0h
Time Spent: 10m

> Rename snapshots while marking them deleted 
> 
>
> Key: HDFS-15817
> URL: https://issues.apache.org/jira/browse/HDFS-15817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the ordered snapshot feature turned on, a snapshot is only marked as 
> deleted, and is not actually deleted unless it is the oldest one. Since the 
> snapshot is just marked deleted, creating a new snapshot with the same name 
> as one marked deleted will fail. To mitigate this, the idea here is to rename 
> the snapshot being marked deleted by appending the deletion timestamp along 
> with the snapshot id to its name.
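A hypothetical shape for the new name (illustrative only; the actual format is whatever PR #2677 implements):
{code:java}
// Hypothetical naming scheme for a snapshot being marked deleted.
static String nameOnDeletion(String snapshotName, long deletionTimeMs,
    int snapshotId) {
  // e.g. "daily" -> "daily.deleted.1612368000000.42"
  return snapshotName + ".deleted." + deletionTimeMs + "." + snapshotId;
}
{code}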



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-03 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDFS-15817:
--
Labels: pull-request-available  (was: )

> Rename snapshots while marking them deleted 
> 
>
> Key: HDFS-15817
> URL: https://issues.apache.org/jira/browse/HDFS-15817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> With the ordered snapshot feature turned on, a snapshot is only marked as 
> deleted, and is not actually deleted unless it is the oldest one. Since the 
> snapshot is just marked deleted, creating a new snapshot with the same name 
> as one marked deleted will fail. To mitigate this, the idea here is to rename 
> the snapshot being marked deleted by appending the deletion timestamp along 
> with the snapshot id to its name.






[jira] [Commented] (HDFS-15803) EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo

2021-02-03 Thread Hui Fei (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277947#comment-17277947
 ] 

Hui Fei commented on HDFS-15803:
--------------------------------

[~haiyang Hu] [~sodonnell] Thanks again, committed to trunk!

> EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo 
> ----------------------------------------------------------------------
>
> Key: HDFS-15803
> URL: https://issues.apache.org/jira/browse/HDFS-15803
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: huhaiyang
>Assignee: huhaiyang
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HDFS-15803_001.patch
>
>
>  Removing the unused method from StripedReconstructionInfo:
> {code:java}
> // StripedReconstructionInfo.java
> /**
>  * Return the weight of this EC reconstruction task.
>  *
>  * DN uses it to coordinate with NN to adjust the speed of scheduling the
>  * reconstructions tasks to this DN.
>  *
>  * @return the weight of this reconstruction task.
>  * @see HDFS-12044
>  */
> int getWeight() {
>   // See HDFS-12044. The weight of a RS(n, k) is calculated by the network
>   // connections it opens.
>   return sources.length + targets.length;
> }
> {code}
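[Editorial note] As a worked example of what the removed method computed: reconstructing one lost block of an RS(6,3) group typically reads from sources.length = 6 datanodes and writes to targets.length = 1 target, so the formula above would have returned a weight of 6 + 1 = 7.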






[jira] [Updated] (HDFS-15803) EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo

2021-02-03 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei updated HDFS-15803:
---------------------------
Fix Version/s: 3.4.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> EC: Remove unnecessary method (getWeight) in StripedReconstructionInfo 
> ----------------------------------------------------------------------
>
> Key: HDFS-15803
> URL: https://issues.apache.org/jira/browse/HDFS-15803
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: huhaiyang
>Assignee: huhaiyang
>Priority: Trivial
> Fix For: 3.4.0
>
> Attachments: HDFS-15803_001.patch
>
>
>  Removing the unused method from StripedReconstructionInfo:
> {code:java}
> // StripedReconstructionInfo.java
> /**
>  * Return the weight of this EC reconstruction task.
>  *
>  * DN uses it to coordinate with NN to adjust the speed of scheduling the
>  * reconstructions tasks to this DN.
>  *
>  * @return the weight of this reconstruction task.
>  * @see HDFS-12044
>  */
> int getWeight() {
>   // See HDFS-12044. The weight of a RS(n, k) is calculated by the network
>   // connections it opens.
>   return sources.length + targets.length;
> }
> {code}






[jira] [Created] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-03 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created HDFS-15817:
----------------------------------------

 Summary: Rename snapshots while marking them deleted 
 Key: HDFS-15817
 URL: https://issues.apache.org/jira/browse/HDFS-15817
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Shashikant Banerjee


With the ordered snapshot feature turned on, a snapshot is only marked as 
deleted; it is not actually deleted unless it is the oldest one. Since the 
snapshot is merely marked deleted, creating a new snapshot with the same name 
as the marked-deleted one will fail. To mitigate this, the idea here is to 
rename a snapshot being marked as deleted by appending the deletion timestamp 
along with the snapshot id to its name.






[jira] [Assigned] (HDFS-15817) Rename snapshots while marking them deleted

2021-02-03 Thread Shashikant Banerjee (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shashikant Banerjee reassigned HDFS-15817:
-------------------------------------------

Assignee: Shashikant Banerjee

> Rename snapshots while marking them deleted 
> --------------------------------------------
>
> Key: HDFS-15817
> URL: https://issues.apache.org/jira/browse/HDFS-15817
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
>
> With the ordered snapshot feature turned on, a snapshot is only marked as 
> deleted; it is not actually deleted unless it is the oldest one. Since the 
> snapshot is merely marked deleted, creating a new snapshot with the same name 
> as the marked-deleted one will fail. To mitigate this, the idea here is to 
> rename a snapshot being marked as deleted by appending the deletion timestamp 
> along with the snapshot id to its name.






[jira] [Updated] (HDFS-15815) if required storageType are unavailable, log the failed reason during choosing Datanode

2021-02-03 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15815:
----------------------------
Summary:  if required storageType are unavailable, log the failed reason 
during choosing Datanode  (was:  if required storageType are unavailable log 
the failed reason during choosing Datanode)

>  if required storageType are unavailable, log the failed reason during 
> choosing Datanode
> ------------------------------------------------------------------------------------
>
> Key: HDFS-15815
> URL: https://issues.apache.org/jira/browse/HDFS-15815
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: block placement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15815.001.patch
>
>
> For better debugging, if the required storageType is unavailable, log the 
> failure reason "NO_REQUIRED_STORAGE_TYPE" when choosing a Datanode.
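[Editorial note] A minimal sketch of what such logging could look like; the enum, method, and log format below are assumptions for illustration and do not necessarily match the patch:

{code:java}
// Illustrative sketch only: surface the reason a datanode was rejected
// during block placement, including the missing-storage-type case.
public class ChooseTargetLoggingSketch {

  enum NodeNotChosenReason { NODE_STALE, NOT_ENOUGH_SPACE, NO_REQUIRED_STORAGE_TYPE }

  static void logNodeIsNotChosen(String node, NodeNotChosenReason reason) {
    // In the real placement policy this would go to a debug logger;
    // stdout keeps the sketch self-contained.
    System.out.println("DEBUG Datanode " + node + " is not chosen, reason: " + reason);
  }

  public static void main(String[] args) {
    // e.g. the node has no volume of the required storage type (SSD, DISK, ...)
    logNodeIsNotChosen("127.0.0.1:9866", NodeNotChosenReason.NO_REQUIRED_STORAGE_TYPE);
  }
}
{code}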






[jira] [Updated] (HDFS-15816) If NO stale node in last choosing, the chooseTarget don't need to retry with stale nodes.

2021-02-03 Thread Yang Yun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yun updated HDFS-15816:
----------------------------
Attachment: HDFS-15816.001.patch

> If NO stale node in last choosing, the chooseTarget don't need to retry with 
> stale nodes.
> ------------------------------------------------------------------------------------------
>
> Key: HDFS-15816
> URL: https://issues.apache.org/jira/browse/HDFS-15816
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: block placement
>Reporter: Yang Yun
>Assignee: Yang Yun
>Priority: Minor
> Attachments: HDFS-15816.001.patch
>
>
> If there was no stale node in the last round of choosing, chooseTarget does 
> not need to retry with stale nodes.
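[Editorial note] A minimal sketch of the proposed optimization; the real chooseTarget logic is far more involved, and every name here is an illustrative stand-in:

{code:java}
// Illustrative sketch only: retry with stale nodes included solely when a
// stale node was actually skipped in the previous placement attempt.
import java.util.Arrays;
import java.util.List;

public class StaleRetrySketch {

  static final class Node {
    final String name;
    final boolean stale;
    Node(String name, boolean stale) { this.name = name; this.stale = stale; }
  }

  // Returns the first acceptable node; records in sawStale[0] whether a
  // stale node was passed over while stale nodes were being avoided.
  static Node chooseTarget(List<Node> nodes, boolean avoidStale, boolean[] sawStale) {
    for (Node n : nodes) {
      if (avoidStale && n.stale) {
        sawStale[0] = true;
        continue;
      }
      return n;
    }
    return null;
  }

  public static void main(String[] args) {
    List<Node> nodes = Arrays.asList(new Node("dn1", true), new Node("dn2", true));
    boolean[] sawStale = new boolean[1];
    Node chosen = chooseTarget(nodes, true, sawStale);
    if (chosen == null && sawStale[0]) {
      // Retry only because a stale node was actually skipped; if
      // sawStale[0] were false, a retry could not succeed either.
      chosen = chooseTarget(nodes, false, sawStale);
    }
    System.out.println(chosen == null ? "no target" : chosen.name);
  }
}
{code}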






[jira] [Created] (HDFS-15816) If NO stale node in last choosing, the chooseTarget don't need to retry with stale nodes.

2021-02-03 Thread Yang Yun (Jira)
Yang Yun created HDFS-15816:
----------------------------

 Summary: If NO stale node in last choosing, the chooseTarget don't 
need to retry with stale nodes.
 Key: HDFS-15816
 URL: https://issues.apache.org/jira/browse/HDFS-15816
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: block placement
Reporter: Yang Yun
Assignee: Yang Yun


If there was no stale node in the last round of choosing, chooseTarget does not 
need to retry with stale nodes.


