[jira] [Commented] (HDDS-1766) ContainerStateMachine is unable to increment lastAppliedTermIndex
[ https://issues.apache.org/jira/browse/HDDS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884565#comment-16884565 ] Hudson commented on HDDS-1766:
--
FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #16910 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/16910/])
HDDS-1766. ContainerStateMachine is unable to increment lastAppliedTermIndex (github: rev 0976f6fc30ed8bb774d823f09c58cea54be05ae7)
* (edit) hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java

> ContainerStateMachine is unable to increment lastAppliedTermIndex
> -
>
> Key: HDDS-1766
> URL: https://issues.apache.org/jira/browse/HDDS-1766
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Datanode
> Reporter: Lokesh Jain
> Assignee: Mukul Kumar Singh
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 1h
> Remaining Estimate: 0h
>
> ContainerStateMachine#updateLastApplied currently updates the lastAppliedTermIndex using applyTransactionCompletionMap. There are null entries in the applyTransactionCompletionMap causing the lastAppliedIndex to not be incremented.

--
This message was sent by Atlassian JIRA (v7.6.14#76016)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
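The failure mode described in the issue can be modeled with a small, hypothetical sketch. The class and field names below mirror the issue text, but this is not the actual Ozone ContainerStateMachine code; it only illustrates why a null entry in the completion map stalls the applied index.

```java
import java.util.TreeMap;

// Hypothetical model of the bug in HDDS-1766: lastAppliedIndex only advances
// while consecutive entries in applyTransactionCompletionMap are non-null,
// so a single null entry stalls it forever. Not the real Ozone code.
class LastAppliedSketch {
    private final TreeMap<Long, Long> applyTransactionCompletionMap = new TreeMap<>();
    private long lastAppliedIndex;

    LastAppliedSketch(long startIndex) {
        this.lastAppliedIndex = startIndex;
    }

    // Record completion of the transaction at 'index' with the given term.
    // (A null term reproduces the bad entries mentioned in the report.)
    void complete(long index, Long term) {
        applyTransactionCompletionMap.put(index, term);
        updateLastApplied();
    }

    // Advance over consecutive completed indices. A null value is
    // indistinguishable from a missing entry here, so progress stops at it.
    private void updateLastApplied() {
        while (applyTransactionCompletionMap.get(lastAppliedIndex + 1) != null) {
            lastAppliedIndex++;
        }
    }

    long getLastAppliedIndex() {
        return lastAppliedIndex;
    }
}
```

With entries 1 and 3 completed normally but entry 2 recorded as null, the applied index never moves past 1, which is the symptom the report describes.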
[jira] [Updated] (HDDS-1766) ContainerStateMachine is unable to increment lastAppliedTermIndex
[ https://issues.apache.org/jira/browse/HDDS-1766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1766:
Resolution: Fixed
Fix Version/s: 0.5.0
Status: Resolved (was: Patch Available)
Thanks for the review [~ljain] & [~shashikant]. I have committed this to trunk.
[jira] [Work logged] (HDDS-1766) ContainerStateMachine is unable to increment lastAppliedTermIndex
[ https://issues.apache.org/jira/browse/HDDS-1766?focusedWorklogId=276346=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276346 ] ASF GitHub Bot logged work on HDDS-1766:
Author: ASF GitHub Bot
Created on: 14/Jul/19 05:23
Start Date: 14/Jul/19 05:23
Worklog Time Spent: 10m
Work Description: mukul1987 commented on pull request #1072: HDDS-1766. ContainerStateMachine is unable to increment lastAppliedTermIndex. Contributed by Mukul Kumar Singh.
URL: https://github.com/apache/hadoop/pull/1072
Issue Time Tracking
---
Worklog Id: (was: 276346)
Time Spent: 1h (was: 50m)
[jira] [Comment Edited] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884556#comment-16884556 ] Lisheng Sun edited comment on HDFS-13571 at 7/14/19 5:04 AM:
-
Thank [~linyiqun] [~jojochuang] for the good suggestions. We do not use HBase read replicas + HDFS hedged reads in Xiaomi HBase, because the same read would require twice the bandwidth and we are worried about the impact on performance. A summary of the Dead DataNode Detector follows:
1. Node state machine (see attached). When an InputStream is opened, a BlockReader is opened and the DataNodes holding the block are added to the live-node list, which DeadNodeDetector probes periodically. If a node is found to be unreachable, it is moved to the dead-node list. The InputStream itself also accesses live nodes, and if an error occurs the node is placed on the suspicious-node list.
2. A DataNode on the suspicious list may really be broken, or the access may have failed only because the block is no longer on it. It therefore needs to be confirmed by re-probing, at a higher priority than other probes: if the DataNode really is bad, it should reach the dead-node list quickly.
3. DeadNodeDetector also periodically probes nodes on the dead-node list and moves a node back to the live-node list when a probe succeeds. This continuous re-probing is necessary because a DataNode may rejoin the cluster after a service restart or machine repair; without it, the node would be excluded permanently.
The patch HDFS-13571-2.6.diff is quite old and we have made many updates recently. I can submit them separately as sub-tasks in the future. Please [~linyiqun] [~jojochuang] help review this idea. Thank you again.

> Dead DataNode Detector
> --
>
> Key: HDFS-13571
> URL: https://issues.apache.org/jira/browse/HDFS-13571
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client
> Affects Versions: 2.4.0, 2.6.0, 3.0.2
> Reporter: Gang Xie
> Assignee: Lisheng Sun
> Priority: Minor
> Attachments: HDFS-13571-2.6.diff, node status machine.png
>
> Currently, the information about dead datanodes in a DFSInputStream is stored locally, so it cannot be shared among the input streams of the same DFSClient. In our production environment, some datanodes die every day from different causes. After the first input stream blocks and detects this, it could share the information with the others in the same DFSClient; instead, the other input streams are still blocked by the dead node for some time, which can cause bad service latency.
> To eliminate this impact of dead datanodes, we designed a dead datanode detector, which detects dead nodes in advance and shares this information among all the input streams in the same client. This improvement has been online for some months and works fine, so we decided to port it to 3.0 (the versions used in our production environment are 2.4 and 2.6).
> I will do the porting work and upload the code later.
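The three lists and transitions in the summary above can be sketched as a minimal state machine. The class name follows the comment, but the probe is injected as a plain predicate and everything else is a stand-in; the real detector probes DataNodes over the network and lives inside the HDFS client.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

// Minimal sketch of the state machine described in the comment: nodes move
// between LIVE, SUSPICIOUS and DEAD sets based on probe results. The probe
// is a stand-in predicate (true = node is reachable).
class DeadNodeDetectorSketch {
    private final Set<String> live = new HashSet<>();
    private final Set<String> suspicious = new HashSet<>();
    private final Set<String> dead = new HashSet<>();
    private final Predicate<String> probe;

    DeadNodeDetectorSketch(Predicate<String> probe) {
        this.probe = probe;
    }

    // A new BlockReader registers the DataNodes of its block as live.
    void register(String node) {
        live.add(node);
    }

    // An InputStream hit an error on this node: demote it for re-probing.
    void reportError(String node) {
        live.remove(node);
        suspicious.add(node);
    }

    // Periodic pass: suspicious nodes are checked first (higher priority),
    // then dead nodes are re-probed so a restarted node can rejoin.
    void runDetectionCycle() {
        for (String node : new HashSet<>(suspicious)) {
            suspicious.remove(node);
            (probe.test(node) ? live : dead).add(node);
        }
        for (String node : new HashSet<>(dead)) {
            if (probe.test(node)) {
                dead.remove(node);
                live.add(node);
            }
        }
    }

    boolean isDead(String node) { return dead.contains(node); }
    boolean isLive(String node) { return live.contains(node); }
}
```

The second loop is what the comment's point 3 requires: without it, a DataNode that came back after a restart or repair would stay excluded forever.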
[jira] [Updated] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571:
---
Attachment: node status machine.png
[jira] [Updated] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571:
---
Attachment: 屏幕快照 2019-07-14 下午12.27.22.png
[jira] [Updated] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571:
---
Attachment: (was: 屏幕快照 2019-07-14 下午12.27.22.png)
[jira] [Commented] (HDFS-14614) Add Secure Flag for DataNode Web UI Cookies
[ https://issues.apache.org/jira/browse/HDFS-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884527#comment-16884527 ] Wei-Chiu Chuang commented on HDFS-14614:
Absolutely! Thank you!

> Add Secure Flag for DataNode Web UI Cookies
> ---
>
> Key: HDFS-14614
> URL: https://issues.apache.org/jira/browse/HDFS-14614
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: datanode
> Affects Versions: 2.7.0
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Major
>
> It looks like HDFS-7279 removed the Secure flag for DataNode Web UI cookies. I think we should add it back.
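For background, the Secure attribute on a Set-Cookie header tells the browser to send the cookie only over HTTPS. A minimal illustration of the difference (this is not the DataNode's actual cookie-handling code, just the header shape the fix would restore):

```java
// Illustration only: building a Set-Cookie header value with and without
// the Secure attribute. Secure instructs browsers to transmit the cookie
// over HTTPS connections only; HttpOnly hides it from page scripts.
class SecureCookieSketch {
    static String setCookieHeader(String name, String value, boolean secure) {
        StringBuilder header = new StringBuilder();
        header.append(name).append('=').append(value).append("; HttpOnly");
        if (secure) {
            header.append("; Secure");
        }
        return header.toString();
    }
}
```

Without the Secure attribute, a cookie issued by an HTTPS-enabled web UI can still be sent over plain HTTP, which is the exposure the issue is about.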
[jira] [Assigned] (HDFS-14614) Add Secure Flag for DataNode Web UI Cookies
[ https://issues.apache.org/jira/browse/HDFS-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang reassigned HDFS-14614:
--
Assignee: Lisheng Sun (was: Wei-Chiu Chuang)
[jira] [Comment Edited] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884517#comment-16884517 ] Wei-Chiu Chuang edited comment on HDFS-13571 at 7/13/19 11:01 PM:
--
I haven't reviewed the patch, but this looks like a good addition. If the goal is to improve HBase availability, there was a talk given by Cloudera, Apple and Bloomberg at last year's HBaseCon where they talked about using HBase read replicas + HDFS hedged reads. Just FYI.
[jira] [Commented] (HDFS-13507) RBF: Remove update functionality from routeradmin's add cmd
[ https://issues.apache.org/jira/browse/HDFS-13507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884452#comment-16884452 ] Gang Li commented on HDFS-13507:
Sorry guys, I have been busy with some projects. [~ayushtkn] I can rebase it.

> RBF: Remove update functionality from routeradmin's add cmd
> ---
>
> Key: HDFS-13507
> URL: https://issues.apache.org/jira/browse/HDFS-13507
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Wei Yan
> Assignee: Gang Li
> Priority: Minor
> Labels: incompatible
> Attachments: HDFS-13507-HDFS-13891.003.patch, HDFS-13507-HDFS-13891.004.patch, HDFS-13507.000.patch, HDFS-13507.001.patch, HDFS-13507.002.patch
>
> Following up the discussion in HDFS-13326, we should remove the "update" functionality from routeradmin's add cmd, to make it consistent with the RPC calls.
> Note that this is an incompatible change.
[jira] [Commented] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884389#comment-16884389 ] Yiqun Lin commented on HDFS-13571:
--
Hi [~leosun08], the idea makes sense to me. I took a quick pass through your patch; it seems a little large. If we want to backport this to trunk, we need to split the big patch into sub-tasks, which will also make it easier for others to review. But before creating sub-tasks for this feature, can you write a short summary of the implementation? For example, introduce the mechanism of the core DeadNodeDetector and how it works with DFSClient and DFSInputStream. This is the key information we want to get.
[jira] [Commented] (HDFS-14614) Add Secure Flag for DataNode Web UI Cookies
[ https://issues.apache.org/jira/browse/HDFS-14614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884381#comment-16884381 ] Lisheng Sun commented on HDFS-14614:
Hi [~jojochuang], I would like to work on this issue. Could you assign it to me? Thank you.
[jira] [Comment Edited] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883490#comment-16883490 ] Lisheng Sun edited comment on HDFS-14313 at 7/13/19 2:04 PM:
-
Hi [~jojochuang] :) Could you continue to help review it? Thank you.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory instead of df/du
> 
>
> Key: HDFS-14313
> URL: https://issues.apache.org/jira/browse/HDFS-14313
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode, performance
> Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
> Reporter: Lisheng Sun
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, HDFS-14313.005.patch, HDFS-14313.006.patch
>
> Both existing ways of getting used space, du and df, are insufficient:
> # Running du across lots of disks is very expensive, and running all of the processes at the same time creates a noticeable IO spike.
> # Running df is inaccurate when the disk is shared by multiple datanodes or other servers.
> Getting the hdfs used space from the FsDatasetImpl#volumeMap ReplicaInfos in memory is cheap and accurate.
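The idea in the description above can be sketched in a few lines: instead of shelling out to du (IO spikes) or df (wrong on shared disks), sum the lengths of the replicas the DataNode already tracks in memory. `ReplicaSketch` below is a hypothetical stand-in for the real ReplicaInfo, and the map stands in for FsDatasetImpl#volumeMap.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the HDFS-14313 idea: compute used space from the in-memory
// replica map rather than by scanning the filesystem with du/df.
class UsedSpaceSketch {
    static final class ReplicaSketch {
        final long blockId;
        final long numBytes; // on-disk length of the block data

        ReplicaSketch(long blockId, long numBytes) {
            this.blockId = blockId;
            this.numBytes = numBytes;
        }
    }

    // Used space = sum of replica lengths: one O(n) pass over the map the
    // DataNode already maintains, with no filesystem traversal at all.
    static long usedSpace(Map<Long, ReplicaSketch> volumeMap) {
        long used = 0;
        for (ReplicaSketch r : volumeMap.values()) {
            used += r.numBytes;
        }
        return used;
    }
}
```

This also sidesteps the df failure mode: the sum counts only this DataNode's replicas, so other services sharing the disk cannot distort the number.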
[jira] [Commented] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884379#comment-16884379 ] Lisheng Sun commented on HDFS-13693:
Thank [~hexiaoqiao] for your suggestion.
{quote}
It seems to bring a little limit for INodeDirectorySection serialization. After applying this patch, it has to be guaranteed that child inodes are serialized in order; otherwise a child inode that is out of order could not be found by #binarySearch. I believe the current method #serializeINodeDirectorySection already does this, but nothing enforces that logic, and I am concerned someone else may want to change it. So would it be better to add an annotation or some other guard against this situation?
{quote}
This patch only acts on loading an fsimage that has already been serialized on disk, so I don't think it adds a limit on INodeDirectorySection serialization.
{quote}
INodeReference could be optimized the same way?
{quote}
Sorry, I don't use snapshots and don't know that code off the top of my head, so I cannot experiment with it and am not sure I can do it. If you use a snapshotted fsimage, could you help experiment with it? Thank you.

> Remove unnecessary search in INodeDirectory.addChild during image loading
> -
>
> Key: HDFS-13693
> URL: https://issues.apache.org/jira/browse/HDFS-13693
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Reporter: zhouyingchao
> Assignee: Lisheng Sun
> Priority: Major
> Attachments: HDFS-13693-001.patch, HDFS-13693-002.patch, HDFS-13693-003.patch, HDFS-13693-004.patch
>
> In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added to their parent INode's map one by one. The adding procedure searches for a position in the parent's map and then inserts the child at that position. However, during image loading the search is unnecessary, since the insert position is always at the end of the map given the sequence in which the children are serialized on disk.
> Testing this patch against an fsimage of a 70PB cluster (200 million files and 300 million blocks), the image loading time was reduced from 1210 seconds to 1138 seconds, about a 6% reduction.
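The optimization described above can be sketched as follows. The class and method names are hypothetical (the real code lives in INodeDirectory and FSImageFormatPBINode, and its children list holds INodes, not strings); the point is only the contrast between the normal binary-search insert and the O(1) append that is safe when children arrive already sorted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the HDFS-13693 optimization: during fsimage loading, children
// arrive already sorted by name, so the insertion point is always the end
// of the sorted list and the usual binary search can be skipped.
class AddChildSketch {
    private final List<String> children = new ArrayList<>();

    // Normal path: find the insertion point with a binary search, O(log n)
    // plus an O(n) shift for a mid-list insert.
    void addChild(String name) {
        int pos = Collections.binarySearch(children, name);
        if (pos < 0) {
            children.add(-pos - 1, name);
        }
    }

    // Image-loading path: trust the on-disk order and append in O(1),
    // asserting the sorted-order invariant instead of searching for it.
    void addChildAtLoading(String name) {
        assert children.isEmpty()
            || children.get(children.size() - 1).compareTo(name) < 0;
        children.add(name);
    }

    List<String> getChildren() {
        return children;
    }
}
```

Both paths produce the same sorted children list when the input is ordered; the loading path just skips work that the serialization order already guarantees, which is where the reported ~6% loading-time win comes from.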
[jira] [Updated] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571: --- Fix Version/s: (was: 3.0.2) > Dead DataNode Detector > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Minor > Attachments: HDFS-13571-2.6.diff > > > Currently, the information about dead datanodes in DFSInputStream is stored > locally, so it cannot be shared among the input streams of the same > DFSClient. In our production environment, some datanodes die every day from > different causes. Today, even after the first input stream blocks on a dead node > and detects it, that information cannot be shared with the others in the same DFSClient; > thus, the other input streams are still blocked by the dead node for some > time, which can cause bad service latency. > To eliminate this impact of dead datanodes, we designed a dead datanode > detector, which detects dead nodes in advance and shares this information > among all the input streams in the same client. This improvement has been > online for some months and works fine. So we decided to port it to 3.0 (the > versions used in our production environment are 2.4 and 2.6). > I will do the porting work and upload the code later.
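The design described in the issue keeps one shared dead-node set per DFSClient instead of a private set per input stream. A hypothetical sketch of that sharing is below; the class and method names are invented for illustration and are not the attached patch's API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-DFSClient registry: all input streams of one client
// consult and update the same thread-safe dead-node set.
class SharedDeadNodeRegistry {
    private final Set<String> deadNodes = ConcurrentHashMap.newKeySet();

    // Called when any stream detects a dead datanode.
    public void markDead(String datanodeId) {
        deadNodes.add(datanodeId);
    }

    // Every stream checks the shared set before reading from a node.
    public boolean isDead(String datanodeId) {
        return deadNodes.contains(datanodeId);
    }

    // A background probe could remove nodes that come back to life.
    public void markAlive(String datanodeId) {
        deadNodes.remove(datanodeId);
    }
}
```

With this structure, the first stream to hit a dead node pays the detection cost once, and sibling streams skip that node immediately.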
[jira] [Comment Edited] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884305#comment-16884305 ] Lisheng Sun edited comment on HDFS-13571 at 7/13/19 1:16 PM: - Hi [~jojochuang] [~sodonnell] [~elgoiri] [~xkrogen] [~hexiaoqiao] Could you find time to take a look at this issue? The Dead DataNode Detector has been very helpful for XIAOMI HBase availability. Thank you. If this issue is accepted, I will upload a new patch for trunk. was (Author: leosun08): Hi [~jojochuang] [~sodonnell] [~elgoiri] [~xkrogen] Could you have time to pay attention to this issue? XIAOMI HBase availability is very helpful by Dead DataNode Detector. Thank you. If this issue is accepted, I will update new patch for trunk. > Dead DataNode Detector > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Minor > Fix For: 3.0.2 > > Attachments: HDFS-13571-2.6.diff > > > Currently, the information about dead datanodes in DFSInputStream is stored > locally, so it cannot be shared among the input streams of the same > DFSClient. In our production environment, some datanodes die every day from > different causes. Today, even after the first input stream blocks on a dead node > and detects it, that information cannot be shared with the others in the same DFSClient; > thus, the other input streams are still blocked by the dead node for some > time, which can cause bad service latency. > To eliminate this impact of dead datanodes, we designed a dead datanode > detector, which detects dead nodes in advance and shares this information > among all the input streams in the same client. This improvement has been > online for some months and works fine. So we decided to port it to 3.0 (the > versions used in our production environment are 2.4 and 2.6). > I will do the porting work and upload the code later. 
[jira] [Comment Edited] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884365#comment-16884365 ] Lisheng Sun edited comment on HDFS-13693 at 7/13/19 1:15 PM: - I have uploaded patch 004 for trunk. Could you find time to review this patch, [~jojochuang] [~xkrogen]? Thank you. was (Author: leosun08): I have updated the patch-004 for trunk. Could you have time to continue reviewing this patch?[~jojochuang] Thank you. > Remove unnecessary search in INodeDirectory.addChild during image loading > - > > Key: HDFS-13693 > URL: https://issues.apache.org/jira/browse/HDFS-13693 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-13693-001.patch, HDFS-13693-002.patch, > HDFS-13693-003.patch, HDFS-13693-004.patch > > > In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added > to their parent INode's map one by one. The adding procedure searches for a > position in the parent's map and then inserts the child at that position. > However, during image loading, the search is unnecessary, since the insert > position is always at the end of the map given the sequence in which children are > serialized on disk. > Testing this patch against an fsimage of a 70PB cluster (200 million files and > 300 million blocks), the image loading time was reduced from 1210 seconds to > 1138 seconds, a reduction of about 10%.
[jira] [Commented] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884365#comment-16884365 ] Lisheng Sun commented on HDFS-13693: I have uploaded patch 004 for trunk. Could you find time to continue reviewing this patch, [~jojochuang]? Thank you. > Remove unnecessary search in INodeDirectory.addChild during image loading > - > > Key: HDFS-13693 > URL: https://issues.apache.org/jira/browse/HDFS-13693 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-13693-001.patch, HDFS-13693-002.patch, > HDFS-13693-003.patch, HDFS-13693-004.patch > > > In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added > to their parent INode's map one by one. The adding procedure searches for a > position in the parent's map and then inserts the child at that position. > However, during image loading, the search is unnecessary, since the insert > position is always at the end of the map given the sequence in which children are > serialized on disk. > Testing this patch against an fsimage of a 70PB cluster (200 million files and > 300 million blocks), the image loading time was reduced from 1210 seconds to > 1138 seconds, a reduction of about 10%.
[jira] [Commented] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884357#comment-16884357 ] Hadoop QA commented on HDFS-13693: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 50s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 22s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 10s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 12m 24s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 17s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 2s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 35s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 4s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 38s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}122m 27s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}185m 49s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.ha.TestBootstrapAliasmap | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | | | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations | | | hadoop.hdfs.web.TestWebHdfsTimeouts | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:bdbca0e | | JIRA Issue | HDFS-13693 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974602/HDFS-13693-004.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 857fe7219106 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / 4a70a0d | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_212 | | findbugs | v3.1.0-RC1 | | unit | https://builds.apache.org/job/PreCommit-HDFS-Build/27222/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt | | Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/27222/testReport/ | |
[jira] [Commented] (HDFS-14547) DirectoryWithQuotaFeature.quota costs additional memory even the storage type quota is not set.
[ https://issues.apache.org/jira/browse/HDFS-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884341#comment-16884341 ] Hadoop QA commented on HDFS-14547: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 29m 14s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} branch-2.9 Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 49s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 51s{color} | {color:green} branch-2.9 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s{color} | {color:green} branch-2.9 passed with JDK v1.8.0_212 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 29s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 55s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 55s{color} | {color:green} branch-2.9 passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s{color} | {color:green} branch-2.9 passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 43s{color} | {color:green} branch-2.9 passed with JDK v1.8.0_212 {color} | || || || || 
{color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 47s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 43s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 43s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 26s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 2s{color} | {color:green} the patch passed with JDK v1.7.0_95 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 40s{color} | {color:green} the patch passed with JDK v1.8.0_212 {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 80m 37s{color} | {color:red} hadoop-hdfs in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}139m 22s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeHttpServerXFrame | | | hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys | | | hadoop.hdfs.TestMaintenanceState | | | hadoop.hdfs.web.TestWebHdfsTimeouts | | | hadoop.hdfs.server.datanode.TestDirectoryScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=18.09.7 Server=18.09.7 Image:yetus/hadoop:c3439ff | | JIRA Issue | HDFS-14547 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12974600/HDFS-14547-branch-2.9.002.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 46bade4d4393 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality |
[jira] [Updated] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13693: --- Attachment: HDFS-13693-004.patch > Remove unnecessary search in INodeDirectory.addChild during image loading > - > > Key: HDFS-13693 > URL: https://issues.apache.org/jira/browse/HDFS-13693 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-13693-001.patch, HDFS-13693-002.patch, > HDFS-13693-003.patch, HDFS-13693-004.patch > > > In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added > to their parent INode's map one by one. The adding procedure will search a > position in the parent's map and then insert the child to the position. > However, during image loading, the search is unnecessary since the insert > position should always be at the end of the map given the sequence they are > serialized on disk. > Test this patch against a fsimage of a 70PB cluster (200million files and > 300million blocks), the image loading time be reduced from 1210 seconds to > 1138 seconds.So it can reduce up to about 10% of time. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-13693) Remove unnecessary search in INodeDirectory.addChild during image loading
[ https://issues.apache.org/jira/browse/HDFS-13693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884329#comment-16884329 ] Lisheng Sun commented on HDFS-13693: Thanks [~hunhun] for your comments. {quote} If the fsimage is loaded in parallel (HDFS-7784), can it still be guaranteed that child inodes are serialized in order? {quote} HDFS-7784, which loads the fsimage in parallel, does not conflict with this patch; both patches optimize deserialization. > Remove unnecessary search in INodeDirectory.addChild during image loading > - > > Key: HDFS-13693 > URL: https://issues.apache.org/jira/browse/HDFS-13693 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: zhouyingchao >Assignee: Lisheng Sun >Priority: Major > Attachments: HDFS-13693-001.patch, HDFS-13693-002.patch, > HDFS-13693-003.patch > > > In FSImageFormatPBINode.loadINodeDirectorySection, all child INodes are added > to their parent INode's map one by one. The adding procedure searches for a > position in the parent's map and then inserts the child at that position. > However, during image loading, the search is unnecessary, since the insert > position is always at the end of the map given the sequence in which children are > serialized on disk. > Testing this patch against an fsimage of a 70PB cluster (200 million files and > 300 million blocks), the image loading time was reduced from 1210 seconds to > 1138 seconds, a reduction of about 10%.
[jira] [Commented] (HDFS-14547) DirectoryWithQuotaFeature.quota costs additional memory even the storage type quota is not set.
[ https://issues.apache.org/jira/browse/HDFS-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884322#comment-16884322 ] Jinglun commented on HDFS-14547: Hi [~xkrogen], thanks for your great suggestions. Fixed checkstyle and compilation issues and removed the declaration of ConstEnumException in HDFS-14547-branch-2.9.002.patch. Pending Jenkins. > DirectoryWithQuotaFeature.quota costs additional memory even the storage type > quota is not set. > --- > > Key: HDFS-14547 > URL: https://issues.apache.org/jira/browse/HDFS-14547 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14547-branch-2.9.001.patch, > HDFS-14547-branch-2.9.002.patch, HDFS-14547-design, HDFS-14547-patch003-Test > Report.pdf, HDFS-14547.001.patch, HDFS-14547.002.patch, HDFS-14547.003.patch, > HDFS-14547.004.patch, HDFS-14547.005.patch, HDFS-14547.006.patch, > HDFS-14547.007.patch > > > Our XiaoMi HDFS is considering upgrading from 2.6 to 3.1. We noticed the > storage type quota 'tsCounts' is instantiated as > EnumCounters<StorageType>(StorageType.class), so it costs a long[5] even > if we don't have any storage type quota on this inode (only a space quota or > name quota). > In our cluster we have many directories with quotas and the NameNode's memory is > tight, so the additional cost is a problem. > See DirectoryWithQuotaFeature.Builder(). > > {code:java} > class DirectoryWithQuotaFeature$Builder { > public Builder() { >this.quota = new QuotaCounts.Builder().nameSpace(DEFAULT_NAMESPACE_QUOTA). >storageSpace(DEFAULT_STORAGE_SPACE_QUOTA). >typeSpaces(DEFAULT_STORAGE_SPACE_QUOTA).build();// set default value -1. >this.usage = new QuotaCounts.Builder().nameSpace(1).build(); > } > public Builder typeSpaces(long val) {// set default value. 
>this.tsCounts.reset(val); >return this; > } > } > class QuotaCounts$Builder { > public Builder() { > this.nsSsCounts = new EnumCounters<Quota>(Quota.class); > this.tsCounts = new EnumCounters<StorageType>(StorageType.class); > } > } > class EnumCounters<E extends Enum<E>> { > public EnumCounters(final Class<E> enumClass) { > final E[] enumConstants = enumClass.getEnumConstants(); > Preconditions.checkNotNull(enumConstants); > this.enumClass = enumClass; > this.counters = new long[enumConstants.length];// new a long array here. > } > } > {code} > Related to HDFS-14542. > 
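One way to avoid the fixed long[5] per quota-carrying inode is to allocate the per-storage-type counter array lazily, only when a storage-type quota is actually set. The sketch below is a simplified illustration of that idea, not the committed patch; the class name and the hard-coded storage-type count are assumptions.

```java
// Illustrative lazy-allocation variant of the EnumCounters pattern shown in
// the issue description: no array is created until a type quota is set.
class LazyTypeCounters {
    private static final int NUM_STORAGE_TYPES = 5; // stand-in for StorageType.values().length
    private long[] counters; // stays null -> no long[5] cost for inodes without type quotas

    public void set(int type, long value) {
        if (counters == null) {
            counters = new long[NUM_STORAGE_TYPES]; // allocated on first real use
        }
        counters[type] = value;
    }

    public long get(int type) {
        return counters == null ? 0L : counters[type];
    }

    public boolean anyQuotaSet() {
        return counters != null;
    }
}
```

For a namespace with millions of quota directories but few storage-type quotas, this trades a null check per access for a long[5] (40 bytes plus object header) saved per directory.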
[jira] [Updated] (HDFS-14547) DirectoryWithQuotaFeature.quota costs additional memory even the storage type quota is not set.
[ https://issues.apache.org/jira/browse/HDFS-14547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jinglun updated HDFS-14547: --- Attachment: HDFS-14547-branch-2.9.002.patch > DirectoryWithQuotaFeature.quota costs additional memory even the storage type > quota is not set. > --- > > Key: HDFS-14547 > URL: https://issues.apache.org/jira/browse/HDFS-14547 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.1.0 >Reporter: Jinglun >Assignee: Jinglun >Priority: Major > Fix For: 3.0.4, 3.3.0, 3.2.1, 3.1.3 > > Attachments: HDFS-14547-branch-2.9.001.patch, > HDFS-14547-branch-2.9.002.patch, HDFS-14547-design, HDFS-14547-patch003-Test > Report.pdf, HDFS-14547.001.patch, HDFS-14547.002.patch, HDFS-14547.003.patch, > HDFS-14547.004.patch, HDFS-14547.005.patch, HDFS-14547.006.patch, > HDFS-14547.007.patch > > > Our XiaoMi HDFS is considering upgrading from 2.6 to 3.1. We notice the > storage type quota 'tsCounts' is instantiated to > EnumCounters(StorageType.class), so it will cost a long[5] even > if we don't have any storage type quota on this inode(only space quota or > name quota). > In our cluster we have many dirs with quota and the NameNode's memory is in > tension, so the additional cost will be a problem. > See DirectoryWithQuotaFeature.Builder(). > > {code:java} > class DirectoryWithQuotaFeature$Builder { > public Builder() { >this.quota = new QuotaCounts.Builder().nameSpace(DEFAULT_NAMESPACE_QUOTA). >storageSpace(DEFAULT_STORAGE_SPACE_QUOTA). >typeSpaces(DEFAULT_STORAGE_SPACE_QUOTA).build();// set default value -1. >this.usage = new QuotaCounts.Builder().nameSpace(1).build(); > } > public Builder typeSpaces(long val) {// set default value. 
>this.tsCounts.reset(val); >return this; > } > } > class QuotaCounts$Builder { > public Builder() { > this.nsSsCounts = new EnumCounters(Quota.class); > this.tsCounts = new EnumCounters(StorageType.class); > } > } > class EnumCounters { > public EnumCounters(final Class enumClass) { > final E[] enumConstants = enumClass.getEnumConstants(); > Preconditions.checkNotNull(enumConstants); > this.enumClass = enumClass; > this.counters = new long[enumConstants.length];// new a long array here. > } > } > {code} > Related to HDFS-14542. > -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14593) RBF: Implement deletion feature for expired records in State Store
[ https://issues.apache.org/jira/browse/HDFS-14593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884321#comment-16884321 ] Takanobu Asanuma commented on HDFS-14593: - I ran {{TestRouterRpc#testNamenodeMetrics}} without the patch in my local computer and it sometimes passed and sometimes failed. Seems it is flaky. I will file it later. Anyway, I don't think it relates to the latest patch. > RBF: Implement deletion feature for expired records in State Store > -- > > Key: HDFS-14593 > URL: https://issues.apache.org/jira/browse/HDFS-14593 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Takanobu Asanuma >Assignee: Takanobu Asanuma >Priority: Major > Attachments: HDFS-14593.001.patch, HDFS-14593.002.patch, > HDFS-14593.003.patch, HDFS-14593.004.patch, HDFS-14593.005.patch, > HDFS-14593.006.patch, HDFS-14593.007.patch, HDFS-14593.008.patch, > HDFS-14593.009.patch, HDFS-14593.010.patch, HDFS-14593.011.patch > > > Currently, any router seems to exist in the Router Information eternally. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
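The feature under discussion removes State Store records whose heartbeat has not been refreshed within a deletion window, so routers no longer live in the Router Information forever. A minimal sketch of that expiry check follows; the class and method names are illustrative, not the actual RBF State Store API.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store: records whose last heartbeat is older than deletionMs
// are purged by a periodic sweep.
class ExpiringRecordStore {
    private final Map<String, Long> lastHeartbeat = new ConcurrentHashMap<>();
    private final long deletionMs;

    public ExpiringRecordStore(long deletionMs) {
        this.deletionMs = deletionMs;
    }

    // Each router refreshes its record on heartbeat.
    public void heartbeat(String routerId, long nowMs) {
        lastHeartbeat.put(routerId, nowMs);
    }

    // Periodic sweep: delete records that have exceeded the deletion window.
    public int deleteExpired(long nowMs) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            if (nowMs - it.next().getValue() > deletionMs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public boolean contains(String routerId) {
        return lastHeartbeat.containsKey(routerId);
    }
}
```

In practice the deletion window would be configured well above the expiration window, so a record is first marked expired and only later removed.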
[jira] [Comment Edited] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884305#comment-16884305 ] Lisheng Sun edited comment on HDFS-13571 at 7/13/19 6:36 AM: - Hi [~jojochuang] [~sodonnell] [~elgoiri] [~xkrogen] Could you find time to take a look at this issue? The Dead DataNode Detector has been very helpful for XIAOMI HBase availability. Thank you. If this issue is accepted, I will upload a new patch for trunk. was (Author: leosun08): Hi [~jojochuang] [~sodonnell] [~elgoiri] [~xkrogen] Could you have time to pay attention to this issue? XIAOMI HBase usability is very helpful by Dead DataNode Detector. Thank you. > Dead DataNode Detector > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Minor > Fix For: 3.0.2 > > Attachments: HDFS-13571-2.6.diff > > > Currently, the information about dead datanodes in DFSInputStream is stored > locally, so it cannot be shared among the input streams of the same > DFSClient. In our production environment, some datanodes die every day from > different causes. Today, even after the first input stream blocks on a dead node > and detects it, that information cannot be shared with the others in the same DFSClient; > thus, the other input streams are still blocked by the dead node for some > time, which can cause bad service latency. > To eliminate this impact of dead datanodes, we designed a dead datanode > detector, which detects dead nodes in advance and shares this information > among all the input streams in the same client. This improvement has been > online for some months and works fine. So we decided to port it to 3.0 (the > versions used in our production environment are 2.4 and 2.6). > I will do the porting work and upload the code later. 
[jira] [Updated] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lisheng Sun updated HDFS-13571: --- Summary: Dead DataNode Detector (was: Dead datanode detector) > Dead DataNode Detector > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Minor > Fix For: 3.0.2 > > Attachments: HDFS-13571-2.6.diff > > > Currently, the information about dead datanodes in DFSInputStream is stored > locally, so it cannot be shared among the input streams of the same > DFSClient. In our production environment, some datanodes die every day from > different causes. Today, even after the first input stream blocks on a dead node > and detects it, that information cannot be shared with the others in the same DFSClient; > thus, the other input streams are still blocked by the dead node for some > time, which can cause bad service latency. > To eliminate this impact of dead datanodes, we designed a dead datanode > detector, which detects dead nodes in advance and shares this information > among all the input streams in the same client. This improvement has been > online for some months and works fine. So we decided to port it to 3.0 (the > versions used in our production environment are 2.4 and 2.6). > I will do the porting work and upload the code later.
[jira] [Commented] (HDFS-13571) Dead DataNode Detector
[ https://issues.apache.org/jira/browse/HDFS-13571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884305#comment-16884305 ] Lisheng Sun commented on HDFS-13571: Hi [~jojochuang] [~sodonnell] [~elgoiri] [~xkrogen] Could you find time to take a look at this issue? The Dead DataNode Detector is very helpful for XIAOMI HBase usability. Thank you. > Dead DataNode Detector > -- > > Key: HDFS-13571 > URL: https://issues.apache.org/jira/browse/HDFS-13571 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.4.0, 2.6.0, 3.0.2 >Reporter: Gang Xie >Assignee: Lisheng Sun >Priority: Minor > Fix For: 3.0.2 > > Attachments: HDFS-13571-2.6.diff > > > Currently, the information about dead datanodes in DFSInputStream is stored > locally, so it cannot be shared among the input streams of the same > DFSClient. In our production environment, some datanodes die every day from > different causes. Today, even after the first input stream blocks on a dead node > and detects it, that information cannot be shared with the others in the same DFSClient; > thus, the other input streams are still blocked by the dead node for some > time, which can cause bad service latency. > To eliminate this impact of dead datanodes, we designed a dead datanode > detector, which detects dead nodes in advance and shares this information > among all the input streams in the same client. This improvement has been > online for some months and works fine. So we decided to port it to 3.0 (the > versions used in our production environment are 2.4 and 2.6). > I will do the porting work and upload the code later.
[jira] [Commented] (HDFS-14644) That replication of block failed leads to decommission is blocked when the number of replicas of block is greater than the number of datanode
[ https://issues.apache.org/jira/browse/HDFS-14644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884303#comment-16884303 ] Lisheng Sun commented on HDFS-14644: Thanks [~jojochuang] [~sodonnell] for paying close attention to this issue. I also agree this is expected behavior. My idea is to let users know that decommission will never complete when the target replica count of a block exceeds the number of live nodes. The solution I came up with:
# Show on the web UI the blocks whose target replica count exceeds the number of live nodes.
# Expose through a command the blocks that cannot be replicated while a node is being decommissioned.
This solution is similar to [~sodonnell]'s idea. I think the purpose of this issue is to be more user-friendly when decommission never completes because some blocks cannot be replicated. Please correct me if I am wrong. Thanks.
> That replication of block failed leads to decommission is blocked when the number of replicas of block is greater than the number of datanode
> -
>
> Key: HDFS-14644
> URL: https://issues.apache.org/jira/browse/HDFS-14644
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 3.1.1, 2.9.2, 3.0.3, 2.8.5, 2.7.7
> Reporter: Lisheng Sun
> Priority: Major
>
> 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) All required storage types are unavailable: unavailableStorages=[DISK, ARCHIVE], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> 2019-07-10,15:37:18,028 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 5 to reach 10 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=false) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
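The condition behind the stuck decommission above (the log shows a replication factor of 10 with only 5 placeable targets) reduces to a simple check. The sketch below is illustrative, not NameNode code: the class and method names are invented, and the real placement policy also accounts for storage types, racks, and decommissioning states.

```java
// Hypothetical sketch of the feasibility check behind HDFS-14644: each
// replica must land on a distinct datanode, so a block's target replication
// is capped by the number of live, non-decommissioning nodes. If the target
// exceeds that cap, decommission can never finish and the operator should
// be warned instead of the NameNode retrying placement forever.
public class DecommissionCheck {
    static boolean canEverSatisfy(int targetReplication, int liveNodes) {
        return targetReplication <= liveNodes;
    }

    public static void main(String[] args) {
        // Replication factor 10 but only 5 live nodes, as in the log above:
        // "still in need of 5 to reach 10" repeats forever.
        System.out.println(canEverSatisfy(10, 5));
        // Replication factor 3 with 5 live nodes: placement can succeed.
        System.out.println(canEverSatisfy(3, 5));
    }
}
```

Blocks failing this check are exactly the ones the comment proposes surfacing on the web UI and through a command, so the operator sees "this can never complete" rather than an endless stream of placement warnings.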