[jira] [Commented] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974888#comment-16974888 ] Aiphago commented on HDFS-14986: Here is the patch for trunk. Can [~leosun08] [~hexiaoqiao] [~linyiqun] review this one? Thanks very much. [^HDFS-14986.001.patch] > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Attachments: HDFS-14986.001.patch > > > Running DU across lots of disks is very expensive. We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory. However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.<init>(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
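The stack trace above shows FsDatasetImpl#deepCopyReplica building a HashSet from a FoldedTreeSet that another thread mutates while the copy is in flight. The failure mode, and one defensive pattern, can be sketched with a plain java.util.TreeSet; this is a minimal reduction under stated assumptions, not the Hadoop code, and the lock object is illustrative (the real fix has to take whatever lock FsDatasetImpl's writers already hold):

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.Set;
import java.util.TreeSet;

// Minimal reduction of the failure: new HashSet<>(set) iterates the source
// set, and a structural change made concurrently invalidates the iterator.
public class CmeDemo {
    public static void main(String[] args) {
        Set<Integer> replicas = new TreeSet<>();
        for (int i = 0; i < 10; i++) replicas.add(i);

        boolean threw = false;
        try {
            Iterator<Integer> it = replicas.iterator();
            it.next();
            replicas.add(100); // stands in for a writer thread adding a replica mid-copy
            it.next();         // the fail-fast iterator detects the modification
        } catch (ConcurrentModificationException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected ConcurrentModificationException");

        // Sketch of one possible fix: perform the copy while holding the same
        // lock the writers take, so no structural change can happen mid-copy.
        final Object datasetLock = new Object(); // illustrative stand-in
        Set<Integer> copy;
        synchronized (datasetLock) {
            copy = new TreeSet<>(replicas);
        }
        System.out.println("copied " + copy.size() + " replicas safely");
    }
}
```

The same race disappears if writers and the refresh thread agree on a single lock; copy-on-write or snapshot iteration are alternative designs with different cost profiles.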
[jira] [Updated] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
[ https://issues.apache.org/jira/browse/HDFS-14986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aiphago updated HDFS-14986: --- Attachment: HDFS-14986.001.patch > ReplicaCachingGetSpaceUsed throws ConcurrentModificationException > -- > > Key: HDFS-14986 > URL: https://issues.apache.org/jira/browse/HDFS-14986 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Reporter: Ryan Wu >Assignee: Ryan Wu >Priority: Major > Attachments: HDFS-14986.001.patch > > > Running DU across lots of disks is very expensive. We applied the patch > HDFS-14313 to get used space from ReplicaInfo in memory. However, new du > threads throw the exception > {code:java} > // 2019-11-08 18:07:13,858 ERROR > [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992--1450855658517] > > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: > ReplicaCachingGetSpaceUsed refresh error > java.util.ConcurrentModificationException: Tree has been modified outside of > iterator > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311) > > at > org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256) > > at java.util.AbstractCollection.addAll(AbstractCollection.java:343) > at java.util.HashSet.<init>(HashSet.java:120) > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052) > > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73) > > at > org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178) > > at java.lang.Thread.run(Thread.java:748) > {code}
[jira] [Commented] (HDFS-14952) Skip safemode if blockTotal is 0 in new NN
[ https://issues.apache.org/jira/browse/HDFS-14952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974868#comment-16974868 ] Xiaoqiao He commented on HDFS-14952: [^HDFS-14952.003.patch] tries to fix the failed unit test TestHASafeMode when blockTotal=0. The other failures seem unrelated. > Skip safemode if blockTotal is 0 in new NN > -- > > Key: HDFS-14952 > URL: https://issues.apache.org/jira/browse/HDFS-14952 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Rajesh Balamohan >Assignee: Xiaoqiao He >Priority: Trivial > Labels: performance > Attachments: HDFS-14952.001.patch, HDFS-14952.002.patch, > HDFS-14952.003.patch > > > When a new NN is installed, it spends 30-45 seconds in Safemode. When > {{blockTotal}} is 0, it should be possible to short-circuit the safemode check in > {{BlockManagerSafeMode::areThresholdsMet}}. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java#L571
[jira] [Updated] (HDFS-14952) Skip safemode if blockTotal is 0 in new NN
[ https://issues.apache.org/jira/browse/HDFS-14952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He updated HDFS-14952: --- Attachment: HDFS-14952.003.patch > Skip safemode if blockTotal is 0 in new NN > -- > > Key: HDFS-14952 > URL: https://issues.apache.org/jira/browse/HDFS-14952 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Rajesh Balamohan >Assignee: Xiaoqiao He >Priority: Trivial > Labels: performance > Attachments: HDFS-14952.001.patch, HDFS-14952.002.patch, > HDFS-14952.003.patch > > > When a new NN is installed, it spends 30-45 seconds in Safemode. When > {{blockTotal}} is 0, it should be possible to short-circuit the safemode check in > {{BlockManagerSafeMode::areThresholdsMet}}. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java#L571
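The proposed short circuit can be sketched as follows. This is a hedged reduction whose method shape follows the issue text, not the actual patch: with no blocks to wait for, the block-threshold test is trivially satisfied, so only the live-datanode threshold should gate safemode exit.

```java
// Hypothetical sketch of the short circuit in a method shaped like
// BlockManagerSafeMode#areThresholdsMet (parameter names are illustrative).
public class SafeModeThresholds {
    static boolean areThresholdsMet(long blockTotal, long blockSafe,
                                    double threshold,
                                    int numLiveDatanodes, int datanodeThreshold) {
        boolean datanodesMet = numLiveDatanodes >= datanodeThreshold;
        if (blockTotal == 0) {
            // Fresh NN: no blocks exist yet, so waiting on the block
            // threshold only delays startup for no benefit.
            return datanodesMet;
        }
        boolean blocksMet = blockSafe >= (long) (threshold * blockTotal);
        return blocksMet && datanodesMet;
    }

    public static void main(String[] args) {
        // A newly installed NN (no blocks) can exit safemode immediately.
        if (!areThresholdsMet(0, 0, 0.999, 0, 0)) throw new AssertionError();
        // A NN with existing blocks still waits for enough block reports.
        if (areThresholdsMet(1000, 500, 0.999, 10, 0)) throw new AssertionError();
        System.out.println("ok");
    }
}
```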
[jira] [Updated] (HDDS-2498) Sonar: Fix issues found in StorageContainerManager class
[ https://issues.apache.org/jira/browse/HDDS-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2498: --- Labels: pull-request-available sonar (was: pull-request-available) > Sonar: Fix issues found in StorageContainerManager class > > > Key: HDDS-2498 > URL: https://issues.apache.org/jira/browse/HDDS-2498 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://sonarcloud.io/project/issues?fileUuids=AW5md-HfKcVY8lQ4ZrcG=hadoop-ozone=AW5md-tIKcVY8lQ4ZsEr=false
[jira] [Commented] (HDFS-14947) infrequent data loss due to rename functionality breaking
[ https://issues.apache.org/jira/browse/HDFS-14947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974849#comment-16974849 ] Surendra Singh Lilhore commented on HDFS-14947: --- Hi [~abhishek.sahani], if you have the logs, can you check whether any delete operation was logged for the parent directory? > infrequent data loss due to rename functionality breaking > - > > Key: HDFS-14947 > URL: https://issues.apache.org/jira/browse/HDFS-14947 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 2.7.3 >Reporter: abhishek sahani >Priority: Critical > > We are facing an issue where data is getting lost from HDFS during rename: > in the namenode logs we see the file renamed successfully, but in HDFS after the > rename the file is not present at the destination location, and thus we are losing > data. > > namenode logs: > 19/10/31 16:54:09 DEBUG top.TopAuditLogger: --- logged event > for top service: allowed=true ugi=root (auth:SIMPLE) ip=/*.*.*.* cmd=rename > src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet > > dst=/topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+00+99.parquet > perm=root:supergroup:rw-r--r-- > > 19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.renameTo: > 
/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet > to > /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+00+99.parquet > 19/10/31 16:54:09 DEBUG ipc.Server: IPC Server handler 8 on 9000: responding > to org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from > *.*.*.*:39854 Call#48333 Retry#0 > 19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* FSDirectory.renameTo: > /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet > to > /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+00+99.parquet > 19/10/31 16:54:09 DEBUG ipc.Server: IPC Server handler 6 on 9000: > org.apache.hadoop.hdfs.protocol.ClientProtocol.getFileInfo from *.*.*.*:39854 > Call#48337 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER > 19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* > FSDirectory.unprotectedRenameTo: > 
/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet > is renamed to > /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+00+99.parquet >
[jira] [Resolved] (HDFS-14956) RBF: Router Safemode status should display properly without any unnecessary time-stamp and info log
[ https://issues.apache.org/jira/browse/HDFS-14956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore resolved HDFS-14956. --- Resolution: Not A Problem > RBF: Router Safemode status should display properly without any unnecessary > time-stamp and info log > --- > > Key: HDFS-14956 > URL: https://issues.apache.org/jira/browse/HDFS-14956 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.1.1 > Environment: RBF Cluster >Reporter: Souryakanta Dwivedy >Priority: Minor > Attachments: RBF_Safemode_log.PNG > > > Router Safemode status should display properly without any unnecessary > time-stamp and info log > Step:- > * Make the Router Safemode On/Off > * Get the Safemode info and check the output format > Actual output :- > ./hdfs dfsrouteradmin -safemode get2019-11-06 17:00:20,209 INFO > federation.RouterAdmin: Router > org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB@31304f14 > safemode status : trueSafe Mode: true > Expected Output :- Router safemode status : true > Safe Mode: true
[jira] [Commented] (HDFS-14956) RBF: Router Safemode status should display properly without any unnecessary time-stamp and info log
[ https://issues.apache.org/jira/browse/HDFS-14956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974841#comment-16974841 ] Surendra Singh Lilhore commented on HDFS-14956: --- [~SouryakantaDwivedy], you are getting this because in your log4j configuration {{hadoop.root.logger}} is configured with {{console}}. If you change it to {{RFA}} then you will not get this log on the console. Another option is to set {{HADOOP_ROOT_LOGGER}} to "INFO,RFA" to avoid this. > RBF: Router Safemode status should display properly without any unnecessary > time-stamp and info log > --- > > Key: HDFS-14956 > URL: https://issues.apache.org/jira/browse/HDFS-14956 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.1.1 > Environment: RBF Cluster >Reporter: Souryakanta Dwivedy >Priority: Minor > Attachments: RBF_Safemode_log.PNG > > > Router Safemode status should display properly without any unnecessary > time-stamp and info log > Step:- > * Make the Router Safemode On/Off > * Get the Safemode info and check the output format > Actual output :- > ./hdfs dfsrouteradmin -safemode get2019-11-06 17:00:20,209 INFO > federation.RouterAdmin: Router > org.apache.hadoop.hdfs.protocolPB.RouterAdminProtocolTranslatorPB@31304f14 > safemode status : trueSafe Mode: true > Expected Output :- Router safemode status : true > Safe Mode: true
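The two workarounds above can be written down concretely. The file path and appender name below are the usual Hadoop defaults, shown as an assumption rather than verified against this cluster:

```shell
# Option 1: point the root logger at the rolling file appender in
# etc/hadoop/log4j.properties so INFO lines no longer reach the console:
#
#     hadoop.root.logger=INFO,RFA        (instead of INFO,console)
#
# Option 2: override the logger for a single invocation via the environment:
export HADOOP_ROOT_LOGGER="INFO,RFA"
./hdfs dfsrouteradmin -safemode get
```

With either option, the "Safe Mode: true" line printed by the command itself remains on stdout while the INFO log line goes to the log file.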
[jira] [Work started] (HDDS-2492) Fix test clean up issue in TestSCMPipelineManager
[ https://issues.apache.org/jira/browse/HDDS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2492 started by Li Cheng. -- > Fix test clean up issue in TestSCMPipelineManager > - > > Key: HDDS-2492 > URL: https://issues.apache.org/jira/browse/HDDS-2492 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Sammi Chen >Assignee: Li Cheng >Priority: Major > > This was opened based on [~sammichen]'s investigation on HDDS-2034. > > {quote}Failure is caused by newly introduced function > TestSCMPipelineManager#testPipelineOpenOnlyWhenLeaderReported which doesn't > close pipelineManager at the end. It's better to fix it in a new JIRA. > {quote}
[jira] [Commented] (HDDS-2492) Fix test clean up issue in TestSCMPipelineManager
[ https://issues.apache.org/jira/browse/HDDS-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974837#comment-16974837 ] Li Cheng commented on HDDS-2492: [https://github.com/apache/hadoop-ozone/pull/179] > Fix test clean up issue in TestSCMPipelineManager > - > > Key: HDDS-2492 > URL: https://issues.apache.org/jira/browse/HDDS-2492 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Sammi Chen >Assignee: Li Cheng >Priority: Major > > This was opened based on [~sammichen]'s investigation on HDDS-2034. > > {quote}Failure is caused by newly introduced function > TestSCMPipelineManager#testPipelineOpenOnlyWhenLeaderReported which doesn't > close pipelineManager at the end. It's better to fix it in a new JIRA. > {quote}
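The cleanup pattern the quote describes can be sketched like this. The real change is in the linked PR; the class and method below are stand-ins for SCMPipelineManager and the test in TestSCMPipelineManager, not the actual code:

```java
import java.io.Closeable;

// Hedged sketch: close the manager in a finally block so a failing assertion
// cannot leak it (and its resources) into the next test in the class.
public class TestCleanupSketch {
    // Stand-in for SCMPipelineManager, which is a Closeable resource.
    static class FakePipelineManager implements Closeable {
        boolean closed;
        @Override public void close() { closed = true; }
    }

    static FakePipelineManager testPipelineOpenOnlyWhenLeaderReported() {
        FakePipelineManager pipelineManager = new FakePipelineManager();
        try {
            // ... exercise leader reporting and assert on pipeline state ...
        } finally {
            pipelineManager.close(); // the step the original test was missing
        }
        return pipelineManager;
    }

    public static void main(String[] args) {
        FakePipelineManager m = testPipelineOpenOnlyWhenLeaderReported();
        if (!m.closed) throw new AssertionError("manager leaked");
        System.out.println("manager closed");
    }
}
```

In JUnit the same effect is usually achieved with an @After method that closes whatever the test created, which also covers tests that return early.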
[jira] [Assigned] (HDDS-2497) SafeMode check should allow key creation on single node pipeline when replication factor is 1
[ https://issues.apache.org/jira/browse/HDDS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Wagle reassigned HDDS-2497: - Assignee: Siddharth Wagle > SafeMode check should allow key creation on single node pipeline when > replication factor is 1 > - > > Key: HDDS-2497 > URL: https://issues.apache.org/jira/browse/HDDS-2497 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Xiaoyu Yao >Assignee: Siddharth Wagle >Priority: Major > > Start a single datanode ozone docker-compose with replication factor of 1. > {code:java} > OZONE-SITE.XML_ozone.replication=1{code} > The key creation failed with Safemode exception below. > {code:java} > >$ docker-compose exec om bash > bash-4.2$ ozone sh vol create /vol1 > bash-4.2$ ozone sh bucket create /vol1/bucket1 > bash-4.2$ ozone sh key put /vol1/bucket1/key1 README.md > SCM_IN_SAFE_MODE SafeModePrecheck failed for allocateBlock{code} >
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974829#comment-16974829 ] Hadoop QA commented on HDFS-14740: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 8s{color} | {color:red} HDFS-14740 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Issue | HDFS-14740 | | Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/28313/console | | Powered by | Apache Yetus 0.8.0 http://yetus.apache.org | This message was automatically generated. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. 
Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user.
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974828#comment-16974828 ] Hadoop QA commented on HDFS-14973: -- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 41s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} | || || || || {color:brown} trunk Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 11s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 1s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 52s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 5s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 34s{color} | {color:green} branch has no errors when building and testing our client artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 15s{color} | {color:green} trunk passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 13s{color} | {color:green} trunk passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 0s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 55s{color} | {color:green} the patch passed {color} | | {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 0m 48s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch generated 1 new + 699 unchanged - 1 fixed = 700 total (was 700) {color} | | {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 59s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s{color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 24s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 21s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 11s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red} 99m 20s{color} | {color:red} hadoop-hdfs in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 33s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}161m 12s{color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer | | | hadoop.hdfs.server.aliasmap.TestSecureAliasMap | | | hadoop.hdfs.TestDFSUpgradeFromImage | \\ \\ || Subsystem || Report/Notes || | Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 | | JIRA Issue | HDFS-14973 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12985872/HDFS-14973.003.patch | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml | | uname | Linux c4b92acf86d2 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /testptch/patchprocess/precommit/personality/provided.sh | | git revision | trunk / d0302d3 | | maven | version: Apache Maven 3.3.9 | | Default Java | 1.8.0_222 | | findbugs | v3.1.0-RC1 | | checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/28312/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt | | unit |
[jira] [Commented] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end
[ https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974819#comment-16974819 ] Surendra Singh Lilhore commented on HDFS-14985: --- It is a duplicate of HDFS-14266. > FSCK for a block of EC Files doesn't display status at the end > - > > Key: HDFS-14985 > URL: https://issues.apache.org/jira/browse/HDFS-14985 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Major > > *Environment* Cluster of 2 Namenodes and 5 Datanodes and ec policy enabled > fsck -blockId of a block associated with an EC File does not print status at > the end and displays null instead > {color:#de350b}*Result :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > {color:#de350b}*Expected :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > Block replica on datanode/rack: vm10/default-rack is HEALTHY >
[jira] [Resolved] (HDFS-14985) FSCK for a block of EC Files doesn't display status at the end
[ https://issues.apache.org/jira/browse/HDFS-14985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore resolved HDFS-14985. --- Resolution: Duplicate > FSCK for a block of EC Files doesn't display status at the end > - > > Key: HDFS-14985 > URL: https://issues.apache.org/jira/browse/HDFS-14985 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ravuri Sushma sree >Assignee: Ravuri Sushma sree >Priority: Major > > *Environment* Cluster of 2 Namenodes and 5 Datanodes and ec policy enabled > fsck -blockId of a block associated with an EC File does not print status at > the end and displays null instead > {color:#de350b}*Result :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > null > {color:#de350b}*Expected :*{color} > ./hdfs fsck -blockId blk_-x > Connecting to namenode via > FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST > 2019 > Block Id: blk_-x > Block belongs to: /ecdir/f2 > No. of Expected Replica: 3 > No. of live Replica: 3 > No. of excess Replica: 0 > No. of stale Replica: 2 > No. of decommissioned Replica: 0 > No. of decommissioning Replica: 0 > No. of corrupted Replica: 0 > Block replica on datanode/rack: vm10/default-rack is HEALTHY >
[jira] [Commented] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974816#comment-16974816 ] Feilong He commented on HDFS-14740: --- [^HDFS_Persistent_Read-Cache_Test-v2.pdf] has been uploaded for your reference. > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user.
[jira] [Resolved] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
[ https://issues.apache.org/jira/browse/HDFS-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Surendra Singh Lilhore resolved HDFS-14987. --- Resolution: Duplicate > EC: EC file blockId location info displaying as "null" with hdfs fsck > -blockId command > -- > > Key: HDFS-14987 > URL: https://issues.apache.org/jira/browse/HDFS-14987 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, tools >Affects Versions: 3.1.2 >Reporter: Souryakanta Dwivedy >Assignee: Ravuri Sushma sree >Priority: Major > Attachments: EC_file_block_info.PNG, > image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, > image-2019-11-13-18-38-18-899.png > > > EC file blockId location info displaying as "null" with hdfs fsck -blockId > command > * Check the blockId information of an EC enabled file with "hdfs fsck > -blockId"; the blockId location related info will display as null, which > needs to be rectified. > Check the attachment "EC_file_block_info" > === > !image-2019-11-13-18-34-00-067.png! > > * Check the output of a normal file block to compare > !image-2019-11-13-18-36-29-063.png! > === > !image-2019-11-13-18-38-18-899.png! > * Actual Output :- null > * Expected output :- It should display the blockId location related info as > (nodes, racks) of the block as specified in the usage info of fsck -blockId > option. [like : Block replica on > datanode/rack: BLR1xx038/default-rack is HEALTHY]
[jira] [Commented] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
[ https://issues.apache.org/jira/browse/HDFS-14987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974815#comment-16974815 ] Surendra Singh Lilhore commented on HDFS-14987: --- It's a duplicate of HDFS-14266. > EC: EC file blockId location info displaying as "null" with hdfs fsck > -blockId command > -- > > Key: HDFS-14987 > URL: https://issues.apache.org/jira/browse/HDFS-14987 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec, tools >Affects Versions: 3.1.2 >Reporter: Souryakanta Dwivedy >Assignee: Ravuri Sushma sree >Priority: Major > Attachments: EC_file_block_info.PNG, > image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, > image-2019-11-13-18-38-18-899.png > > > EC file blockId location info displaying as "null" with hdfs fsck -blockId > command > * Check the blockId information of an EC enabled file with "hdfs fsck > -blockId"; the blockId location related info will display as null, which > needs to be rectified. > Check the attachment "EC_file_block_info" > === > !image-2019-11-13-18-34-00-067.png! > > * Check the output of a normal file block to compare > !image-2019-11-13-18-36-29-063.png! > === > !image-2019-11-13-18-38-18-899.png! > * Actual Output :- null > * Expected output :- It should display the blockId location related info as > (nodes, racks) of the block as specified in the usage info of fsck -blockId > option. [like : Block replica on > datanode/rack: BLR1xx038/default-rack is HEALTHY] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14740) Recover data blocks from persistent memory read cache during datanode restarts
[ https://issues.apache.org/jira/browse/HDFS-14740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feilong He updated HDFS-14740: -- Attachment: HDFS_Persistent_Read-Cache_Test-v2.pdf > Recover data blocks from persistent memory read cache during datanode restarts > -- > > Key: HDFS-14740 > URL: https://issues.apache.org/jira/browse/HDFS-14740 > Project: Hadoop HDFS > Issue Type: Improvement > Components: caching, datanode >Reporter: Feilong He >Assignee: Rui Mo >Priority: Major > Attachments: HDFS-14740.000.patch, HDFS-14740.001.patch, > HDFS-14740.002.patch, HDFS-14740.003.patch, HDFS-14740.004.patch, > HDFS-14740.005.patch, HDFS-14740.006.patch, > HDFS_Persistent_Read-Cache_Design-v1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.1.pdf, > HDFS_Persistent_Read-Cache_Test-v1.pdf, HDFS_Persistent_Read-Cache_Test-v2.pdf > > > In HDFS-13762, persistent memory (PM) is enabled in HDFS centralized cache > management. Even though PM can persist cache data, for simplifying the > initial implementation, the previous cache data will be cleaned up during > DataNode restarts. Here, we are proposing to improve HDFS PM cache by taking > advantage of PM's data persistence characteristic, i.e., recovering the > status for cached data, if any, when DataNode restarts, thus, cache warm up > time can be saved for user. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2499) IsLeader information is lost when update pipeline state
Sammi Chen created HDDS-2499: Summary: IsLeader information is lost when update pipeline state Key: HDDS-2499 URL: https://issues.apache.org/jira/browse/HDDS-2499 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Sammi Chen Assignee: Sammi Chen -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?focusedWorklogId=343990=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343990 ] ASF GitHub Bot logged work on HDDS-2487: Author: ASF GitHub Bot Created on: 15/Nov/19 05:16 Start Date: 15/Nov/19 05:16 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #173: HDDS-2487. Ensure streams are closed URL: https://github.com/apache/hadoop-ozone/pull/173 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343990) Time Spent: 20m (was: 10m) > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham updated HDDS-2487: - Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
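For readers outside the Sonar report: the rule flagged in ContainerDataYaml and OmUtils concerns streams that can leak when an exception is thrown before {{close()}} runs. A minimal sketch of the standard fix, try-with-resources, using an illustrative temp file rather than the actual Ozone code:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamCloseDemo {
    // Read a file through a stream that is guaranteed to be closed:
    // the try-with-resources block calls in.close() on every exit path,
    // including when readAllBytes() throws.
    static String readAll(Path p) throws IOException {
        try (InputStream in = Files.newInputStream(p)) {
            return new String(in.readAllBytes());
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.writeString(tmp, "hello");
        System.out.println(readAll(tmp)); // prints "hello"
        Files.delete(tmp);
    }
}
```

Compared to a manual finally block, try-with-resources also suppresses (rather than loses) a secondary exception thrown by close() itself.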
[jira] [Resolved] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-2494. -- Fix Version/s: 0.5.0 Resolution: Fixed > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?focusedWorklogId=343987=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343987 ] ASF GitHub Bot logged work on HDDS-2494: Author: ASF GitHub Bot Created on: 15/Nov/19 05:11 Start Date: 15/Nov/19 05:11 Worklog Time Spent: 10m Work Description: bharatviswa504 commented on pull request #175: HDDS-2494 Sonar BigDecimal Cleanup URL: https://github.com/apache/hadoop-ozone/pull/175 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343987) Time Spent: 20m (was: 10m) > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 20m > Remaining Estimate: 0h > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
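Background on the rule itself, separate from the HDDS-2494 patch: {{new BigDecimal(double)}} captures the exact binary value of the double, rounding error included, while {{BigDecimal.valueOf(double)}} goes through the double's canonical decimal string. A small standalone illustration:

```java
import java.math.BigDecimal;

public class BigDecimalDemo {
    public static void main(String[] args) {
        // 0.1 has no exact binary representation; the BigDecimal(double)
        // constructor preserves the representation error digit for digit.
        BigDecimal fromCtor = new BigDecimal(0.1);
        // valueOf uses Double.toString(0.1), which yields exactly "0.1".
        BigDecimal fromValueOf = BigDecimal.valueOf(0.1);

        System.out.println(fromCtor);    // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(fromValueOf); // 0.1
    }
}
```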
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang updated HDFS-14884: --- Fix Version/s: 2.10.1 Resolution: Fixed Status: Resolved (was: Patch Available) Pushed the change to branch-2 and branch-2.10 Thanks [~yuvaldeg]. By the way, next time please let committer update fix version for you. > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Affects Versions: 2.11.0 >Reporter: Mukul Kumar Singh >Assignee: Yuval Degani >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.10.1, 2.11.0 > > Attachments: HDFS-14884-branch-2.001.patch, HDFS-14884.001.patch, > HDFS-14884.002.patch, HDFS-14884.003.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974787#comment-16974787 ] Wei-Chiu Chuang commented on HDFS-14884: +1 > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Affects Versions: 2.11.0 >Reporter: Mukul Kumar Singh >Assignee: Yuval Degani >Priority: Major > Fix For: 3.3.0, 3.1.4, 3.2.2, 2.11.0 > > Attachments: HDFS-14884-branch-2.001.patch, HDFS-14884.001.patch, > HDFS-14884.002.patch, HDFS-14884.003.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974766#comment-16974766 ] Takanobu Asanuma commented on HDFS-14924: - The latest patch looks good to me except for the checkstyle issue. The failed test of TestOfflineEditsViewer may be caused by HDFS-14922. > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesn't update the modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-1492) Generated chunk size name too long.
[ https://issues.apache.org/jira/browse/HDDS-1492?focusedWorklogId=343974=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343974 ] ASF GitHub Bot logged work on HDDS-1492: Author: ASF GitHub Bot Created on: 15/Nov/19 04:05 Start Date: 15/Nov/19 04:05 Worklog Time Spent: 10m Work Description: timmylicheng commented on pull request #179: HDDS-1492 Fix test clean up issue in TestSCMPipelineManager. URL: https://github.com/apache/hadoop-ozone/pull/179 ## What changes were proposed in this pull request? Close pipeline in TestSCMPipelineManager#testPipelineOpenOnlyWhenLeaderReported. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2492 (Please create an issue in ASF JIRA before opening a pull request, and you need to set the title of the pull request which starts with the corresponding JIRA issue number. (e.g. HDDS-. Fix a typo in YYY.) Please replace this section with the link to the Apache JIRA) ## How was this patch tested? UT (Please explain how this patch was tested. Ex: unit tests, manual tests) (If this patch involves UI changes, please attach a screen-shot; otherwise, remove this) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343974) Time Spent: 1h 10m (was: 1h) > Generated chunk size name too long. > --- > > Key: HDDS-1492 > URL: https://issues.apache.org/jira/browse/HDDS-1492 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Shashikant Banerjee >Priority: Critical > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Following exception is seen in SCM logs intermittently. 
> {code} > java.lang.RuntimeException: file name > 'chunks/2a54b2a153f4a9c5da5f44e2c6f97c60_stream_9c6ac565-e2d4-469c-bd5c-47922a35e798_chunk_10.tmp.2.23115' > is too long ( > 100 bytes) > {code} > We may have to limit the name of the chunk to 100 bytes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
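The suggestion at the end (limit the chunk name to 100 bytes) could look roughly like the following. {{capName}} is a hypothetical helper, not the actual HDDS fix, and a real patch would also need to keep truncated names unique (for example by hashing the dropped tail):

```java
import java.nio.charset.StandardCharsets;

public class ChunkNameLimit {
    static final int MAX_NAME_BYTES = 100;

    // Cap a generated chunk file name at MAX_NAME_BYTES bytes.
    // Chunk names here are ASCII, so byte length equals char length
    // and a plain substring is safe.
    static String capName(String name) {
        if (name.getBytes(StandardCharsets.UTF_8).length <= MAX_NAME_BYTES) {
            return name;
        }
        return name.substring(0, MAX_NAME_BYTES);
    }

    public static void main(String[] args) {
        // Name shaped like the one in the exception above (104 chars with prefix).
        String name = "chunks/2a54b2a153f4a9c5da5f44e2c6f97c60_stream_"
            + "9c6ac565-e2d4-469c-bd5c-47922a35e798_chunk_10.tmp.2.23115";
        System.out.println(capName(name).length()); // 100
    }
}
```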
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16974759#comment-16974759 ] Konstantin Shvachko commented on HDFS-14973: This is the Jenkins build that built the patch, but couldn't send the report to this jira for some reason: https://builds.apache.org/job/PreCommit-HDFS-Build/28311/testReport/ > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer & mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.003.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired.
However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNameystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
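The flawed delay assignment described above can be modeled in a small standalone sketch (an illustrative simplification, not the actual Dispatcher code):

```java
public class GetBlocksDelayModel {
    // Delay assigned to the taskIndex-th submission, per the report:
    // only the first poolSize tasks can be delayed, and the first maxQps
    // of those intentionally get no delay at all.
    static long delayMs(int taskIndex, int poolSize, int maxQps, long periodMs) {
        if (taskIndex >= poolSize) {
            return 0; // tasks beyond the pool size: never delayed (the bug)
        }
        if (taskIndex < maxQps) {
            return 0; // first maxQps tasks: no delay by design
        }
        return (taskIndex / maxQps) * periodMs; // increasing delay in between
    }

    public static void main(String[] args) {
        // threadpool 100, 20 getBlocks QPS (values from the report)
        System.out.println(delayMs(0, 100, 20, 1000));   // 0: no delay
        System.out.println(delayMs(50, 100, 20, 1000));  // 2000: delayed
        System.out.println(delayMs(100, 100, 20, 1000)); // 0: slips through undeleyed
        // threadpool 10 < maxQps 20: no task is ever delayed
        System.out.println(delayMs(9, 10, 20, 1000));    // 0
    }
}
```

With these numbers, every slot freed by a fast no-delay task (indices 0-19) is immediately refilled by a never-delayed task (index 100+), which is exactly the rush-through behavior the report describes.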
[jira] [Work logged] (HDDS-2498) Sonar: Fix issues found in StorageContainerManager class
[ https://issues.apache.org/jira/browse/HDDS-2498?focusedWorklogId=343968=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343968 ] ASF GitHub Bot logged work on HDDS-2498: Author: ASF GitHub Bot Created on: 15/Nov/19 03:15 Start Date: 15/Nov/19 03:15 Worklog Time Spent: 10m Work Description: swagle commented on pull request #178: HDDS-2498. Fix sonar issues found in StorageContainerManager. URL: https://github.com/apache/hadoop-ozone/pull/178 ## What changes were proposed in this pull request? Fix all the issues except the TODO items. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2498 ## How was this patch tested? Only syntactic changes, waiting on UT and acceptance runs. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343968) Remaining Estimate: 0h Time Spent: 10m > Sonar: Fix issues found in StorageContainerManager class > > > Key: HDDS-2498 > URL: https://issues.apache.org/jira/browse/HDDS-2498 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://sonarcloud.io/project/issues?fileUuids=AW5md-HfKcVY8lQ4ZrcG=hadoop-ozone=AW5md-tIKcVY8lQ4ZsEr=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2498) Sonar: Fix issues found in StorageContainerManager class
[ https://issues.apache.org/jira/browse/HDDS-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2498: - Labels: pull-request-available (was: ) > Sonar: Fix issues found in StorageContainerManager class > > > Key: HDDS-2498 > URL: https://issues.apache.org/jira/browse/HDDS-2498 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > https://sonarcloud.io/project/issues?fileUuids=AW5md-HfKcVY8lQ4ZrcG=hadoop-ozone=AW5md-tIKcVY8lQ4ZsEr=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974756#comment-16974756 ] Konstantin Shvachko commented on HDFS-14973: +1 Good to go. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.003.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. 
However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNameystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2498) Sonar: Fix issues found in StorageContainerManager class
Siddharth Wagle created HDDS-2498: - Summary: Sonar: Fix issues found in StorageContainerManager class Key: HDDS-2498 URL: https://issues.apache.org/jira/browse/HDDS-2498 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Siddharth Wagle Fix For: 0.5.0 https://sonarcloud.io/project/issues?fileUuids=AW5md-HfKcVY8lQ4ZrcG=hadoop-ozone=AW5md-tIKcVY8lQ4ZsEr=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2482) Enable github actions for pull requests
[ https://issues.apache.org/jira/browse/HDDS-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2482: - Labels: pull-request-available (was: ) > Enable github actions for pull requests > --- > > Key: HDDS-2482 > URL: https://issues.apache.org/jira/browse/HDDS-2482 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > > HDDS-2400 introduced a github actions workflow for each "push" event. It > turned out that pushing to a forked repository doesn't trigger this event > even if it's part of a PR. > > We need to enable the execution for pull_request events: > References: > > [https://github.community/t5/GitHub-Actions/Run-a-GitHub-action-on-pull-request-for-PR-opened-from-a-forked/m-p/31147#M690] > [https://help.github.com/en/actions/automating-your-workflow-with-github-actions/events-that-trigger-workflows#pull-request-events-for-forked-repositories] > {noformat} > Note: By default, a workflow only runs when a pull_request's activity type is > opened, synchronize, or reopened. To trigger workflows for more activity > types, use the types keyword.{noformat} > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2482) Enable github actions for pull requests
[ https://issues.apache.org/jira/browse/HDDS-2482?focusedWorklogId=343947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343947 ] ASF GitHub Bot logged work on HDDS-2482: Author: ASF GitHub Bot Created on: 15/Nov/19 02:18 Start Date: 15/Nov/19 02:18 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #171: HDDS-2482. Enable github actions for pull requests URL: https://github.com/apache/hadoop-ozone/pull/171 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343947) Remaining Estimate: 0h Time Spent: 10m > Enable github actions for pull requests > --- > > Key: HDDS-2482 > URL: https://issues.apache.org/jira/browse/HDDS-2482 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HDDS-2400 introduced a github actions workflow for each "push" event. It > turned out that pushing to a forked repository doesn't trigger this event > even if it's part of a PR. > > We need to enable the execution for pull_request events: > References: > > [https://github.community/t5/GitHub-Actions/Run-a-GitHub-action-on-pull-request-for-PR-opened-from-a-forked/m-p/31147#M690] > [https://help.github.com/en/actions/automating-your-workflow-with-github-actions/events-that-trigger-workflows#pull-request-events-for-forked-repositories] > {noformat} > Note: By default, a workflow only runs when a pull_request's activity type is > opened, synchronize, or reopened. 
To trigger workflows for more activity > types, use the types keyword.{noformat} > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
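The note quoted above means the workflow has to opt in to the pull_request event explicitly. A sketch of the relevant fragment of a workflow file (event and activity-type names are from the GitHub Actions documentation; the jobs themselves are omitted):

```yaml
# Run for pushes and for pull requests, including PRs opened from forks.
# "types" here just makes the default PR activity set explicit; add more
# (e.g. edited) to widen it.
on:
  push:
  pull_request:
    types: [opened, synchronize, reopened]
```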
[jira] [Assigned] (HDFS-14983) RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option
[ https://issues.apache.org/jira/browse/HDFS-14983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xieming Li reassigned HDFS-14983: - Assignee: Xieming Li > RBF: Add dfsrouteradmin -refreshSuperUserGroupsConfiguration command option > --- > > Key: HDFS-14983 > URL: https://issues.apache.org/jira/browse/HDFS-14983 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Akira Ajisaka >Assignee: Xieming Li >Priority: Minor > > NameNode can update proxyuser config by -refreshSuperUserGroupsConfiguration > without restarting but DFSRouter cannot. It would be better for DFSRouter to > have such functionality to be compatible with NameNode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands
[ https://issues.apache.org/jira/browse/HDDS-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoyu Yao updated HDDS-2034: - Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~Sammi] for the contribution and all for the reviews. I've merged the change to master. > Async RATIS pipeline creation and destroy through heartbeat commands > > > Key: HDDS-2034 > URL: https://issues.apache.org/jira/browse/HDDS-2034 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 13h > Remaining Estimate: 0h > > Currently, pipeline creation and destroy are synchronous operations. SCM > directly connects to each datanode of the pipeline through a gRPC channel to > create or destroy the pipeline. > This task is to remove the gRPC channel and send pipeline creation and destroy > actions through heartbeat commands to each datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2034) Async RATIS pipeline creation and destroy through heartbeat commands
[ https://issues.apache.org/jira/browse/HDDS-2034?focusedWorklogId=343922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343922 ] ASF GitHub Bot logged work on HDDS-2034: Author: ASF GitHub Bot Created on: 15/Nov/19 01:06 Start Date: 15/Nov/19 01:06 Worklog Time Spent: 10m Work Description: xiaoyuyao commented on pull request #29: HDDS-2034. Async RATIS pipeline creation and destroy through heartbeat commands URL: https://github.com/apache/hadoop-ozone/pull/29 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343922) Time Spent: 13h (was: 12h 50m) > Async RATIS pipeline creation and destroy through heartbeat commands > > > Key: HDDS-2034 > URL: https://issues.apache.org/jira/browse/HDDS-2034 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Sammi Chen >Assignee: Sammi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 13h > Remaining Estimate: 0h > > Currently, pipeline creation and destroy are synchronous operations. SCM > directly connects to each datanode of the pipeline through a gRPC channel to > create or destroy the pipeline. > This task is to remove the gRPC channel and send pipeline creation and destroy > actions through heartbeat commands to each datanode. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
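[Editor's illustration] The HDDS-2034 change above moves pipeline create/destroy from direct SCM-to-datanode gRPC calls to commands delivered with the existing heartbeat exchange. A minimal sketch of that pattern follows; the names (HeartbeatCommandQueue, PipelineAction, drainCommands) are illustrative, not the actual Ozone classes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Sketch: SCM queues pipeline actions per datanode and drains them into
// the next heartbeat response instead of opening a gRPC channel to each
// datanode. This makes pipeline create/destroy asynchronous.
public class HeartbeatCommandQueue {
  public enum PipelineAction { CREATE, CLOSE }

  private final Map<String, Queue<PipelineAction>> pending = new HashMap<>();

  // Called when SCM decides to create or destroy a pipeline.
  public synchronized void addCommand(String datanodeId, PipelineAction action) {
    pending.computeIfAbsent(datanodeId, k -> new ArrayDeque<>()).add(action);
  }

  // Called while handling a datanode heartbeat: returns and clears
  // all commands queued for that datanode.
  public synchronized List<PipelineAction> drainCommands(String datanodeId) {
    Queue<PipelineAction> q = pending.remove(datanodeId);
    return q == null ? new ArrayList<>() : new ArrayList<>(q);
  }

  public static void main(String[] args) {
    HeartbeatCommandQueue scm = new HeartbeatCommandQueue();
    scm.addCommand("dn1", PipelineAction.CREATE);
    scm.addCommand("dn1", PipelineAction.CLOSE);
    System.out.println(scm.drainCommands("dn1").size()); // 2
    System.out.println(scm.drainCommands("dn1").size()); // 0
  }
}
```

The trade-off the JIRA discussion hints at: the datanode learns about new pipelines only on its next heartbeat, so creation latency is bounded by the heartbeat interval rather than a direct RPC round trip.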
[jira] [Created] (HDDS-2497) SafeMode check should allow key creation on single node pipeline when replication factor is 1
Xiaoyu Yao created HDDS-2497: Summary: SafeMode check should allow key creation on single node pipeline when replication factor is 1 Key: HDDS-2497 URL: https://issues.apache.org/jira/browse/HDDS-2497 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Xiaoyu Yao Start a single datanode ozone docker-compose with replication factor of 1. {code:java} OZONE-SITE.XML_ozone.replication=1{code} The key creation failed with the SafeMode exception below. {code:java}
$ docker-compose exec om bash
bash-4.2$ ozone sh vol create /vol1
bash-4.2$ ozone sh bucket create /vol1/bucket1
bash-4.2$ ozone sh key put /vol1/bucket1/key1 README.md
SCM_IN_SAFE_MODE SafeModePrecheck failed for allocateBlock
{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2442) Add ServiceName support for getting Signed Cert.
[ https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2442: - Labels: pull-request-available (was: ) > Add ServiceName support for getting Signed Cert. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > Labels: pull-request-available > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2442) Add ServiceName support for getting Signed Cert.
[ https://issues.apache.org/jira/browse/HDDS-2442?focusedWorklogId=343874&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343874 ] ASF GitHub Bot logged work on HDDS-2442: Author: ASF GitHub Bot Created on: 14/Nov/19 23:41 Start Date: 14/Nov/19 23:41 Worklog Time Spent: 10m Work Description: abhishekaypurohit commented on pull request #177: HDDS-2442. Added support for service name in OM for CSR URL: https://github.com/apache/hadoop-ozone/pull/177 ## What changes were proposed in this pull request? SCM HA needs the ability to represent a group as a single entity, so that tokens issued by any OM that is part of an HA group can be honored by the datanodes. This patch adds service name support for the CSR in OzoneManager. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2442 ## How was this patch tested? Updated corresponding unit tests and tested them by running those tests. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343874) Remaining Estimate: 0h Time Spent: 10m > Add ServiceName support for getting Signed Cert. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2495) Sonar - "notify" may not wake up the appropriate thread
[ https://issues.apache.org/jira/browse/HDDS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2495: - Labels: pull-request-available sonar (was: sonar) > Sonar - "notify" may not wake up the appropriate thread > --- > > Key: HDDS-2495 > URL: https://issues.apache.org/jira/browse/HDDS-2495 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available, sonar > > Addresses same issue within ReplicationManager: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2495) Sonar - "notify" may not wake up the appropriate thread
[ https://issues.apache.org/jira/browse/HDDS-2495?focusedWorklogId=343817=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343817 ] ASF GitHub Bot logged work on HDDS-2495: Author: ASF GitHub Bot Created on: 14/Nov/19 22:54 Start Date: 14/Nov/19 22:54 Worklog Time Spent: 10m Work Description: mbsharp commented on pull request #176: HDDS-2495 Sonar "notify" may not wake up the appropriate thread URL: https://github.com/apache/hadoop-ozone/pull/176 ## What changes were proposed in this pull request? https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2495 ## How was this patch tested? Existing unit tests and clean local build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343817) Remaining Estimate: 0h Time Spent: 10m > Sonar - "notify" may not wake up the appropriate thread > --- > > Key: HDDS-2495 > URL: https://issues.apache.org/jira/browse/HDDS-2495 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 10m > Remaining Estimate: 0h > > Addresses same issue within ReplicationManager: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
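[Editor's illustration] The Sonar rule behind HDDS-2495 flags bare `notify()` because, when several threads wait on the same monitor, `notify()` wakes an arbitrary one, which may not be the thread whose condition became true. The usual fix is `notifyAll()` plus a condition recheck loop. A self-contained sketch (illustrative names, not ReplicationManager's actual code):

```java
// notifyAll() wakes every waiter; each rechecks its own condition in a
// while-loop, so the "wrong thread woken" problem cannot occur.
public class NotifyAllExample {
  private final Object monitor = new Object();
  private boolean ready = false;

  public void awaitReady() throws InterruptedException {
    synchronized (monitor) {
      while (!ready) {       // always wait in a loop to recheck the condition
        monitor.wait();
      }
    }
  }

  public void signalReady() {
    synchronized (monitor) {
      ready = true;
      monitor.notifyAll();   // not notify(): all waiters must observe the change
    }
  }

  public static void main(String[] args) throws InterruptedException {
    NotifyAllExample e = new NotifyAllExample();
    Thread t1 = new Thread(() -> { try { e.awaitReady(); } catch (InterruptedException ex) { } });
    Thread t2 = new Thread(() -> { try { e.awaitReady(); } catch (InterruptedException ex) { } });
    t1.start(); t2.start();
    e.signalReady();
    t1.join(); t2.join();    // both waiters wake; a single notify() could leave one stuck forever
    System.out.println("both threads released");
  }
}
```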
[jira] [Updated] (HDDS-2493) Sonar: Locking on a parameter in NetUtils.removeOutscope
[ https://issues.apache.org/jira/browse/HDDS-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2493: --- Labels: pull-request-available sonar (was: pull-request-available) > Sonar: Locking on a parameter in NetUtils.removeOutscope > > > Key: HDDS-2493 > URL: https://issues.apache.org/jira/browse/HDDS-2493 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-2hKcVY8lQ4ZsNd=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2495) Sonar - "notify" may not wake up the appropriate thread
[ https://issues.apache.org/jira/browse/HDDS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2495: --- Labels: sonar (was: ) > Sonar - "notify" may not wake up the appropriate thread > --- > > Key: HDDS-2495 > URL: https://issues.apache.org/jira/browse/HDDS-2495 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: sonar > > Addresses same issue within ReplicationManager: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2494: --- Labels: pull-request-available sonar (was: pull-request-available) > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 10m > Remaining Estimate: 0h > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2496) Delegate Ozone volume create/list ACL check to authorizer plugin
Xiaoyu Yao created HDDS-2496: Summary: Delegate Ozone volume create/list ACL check to authorizer plugin Key: HDDS-2496 URL: https://issues.apache.org/jira/browse/HDDS-2496 Project: Hadoop Distributed Data Store Issue Type: Improvement Affects Versions: 0.4.1 Reporter: Vivek Ratnavel Subramanian Assignee: Vivek Ratnavel Subramanian Today, Ozone volume create/list ACL checks are not sent to authorization plugins. This causes problems when an authorization plugin is enabled: admins still need to modify ozone-site.xml to change ozone.administrators in order to configure which admins can create volumes. This ticket is opened to have a consistent ACL check for all Ozone resource requests, including admin requests like volume create. This way, the admins defined by the authorization plugin can be honored during volume provisioning without restarting Ozone services. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2495) Sonar - "notify" may not wake up the appropriate thread
[ https://issues.apache.org/jira/browse/HDDS-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matthew Sharp updated HDDS-2495: Description: Addresses same issue within ReplicationManager: [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] was: [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] > Sonar - "notify" may not wake up the appropriate thread > --- > > Key: HDDS-2495 > URL: https://issues.apache.org/jira/browse/HDDS-2495 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > > Addresses same issue within ReplicationManager: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2495) Sonar - "notify" may not wake up the appropriate thread
Matthew Sharp created HDDS-2495: --- Summary: Sonar - "notify" may not wake up the appropriate thread Key: HDDS-2495 URL: https://issues.apache.org/jira/browse/HDDS-2495 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Matthew Sharp Assignee: Matthew Sharp [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDi=AW5md-sVKcVY8lQ4ZsDi] [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-sVKcVY8lQ4ZsDh=AW5md-sVKcVY8lQ4ZsDh] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
[ https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-14989: - Description: Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. bq. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. was: Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. > Add a 'swapBlockList' operation to Namenode. 
> > > Key: HDFS-14989 > URL: https://issues.apache.org/jira/browse/HDFS-14989 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Borrowing from the design doc. > bq. The swapBlockList takes two parameters, a source file and a destination > file. This operation swaps the blocks belonging to the source and the > destination atomically. > bq. The namespace metadata of interest is the INodeFile class. A file > (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, > BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile > contains a list of blocks (BlockInfo[]). The operation will swap > BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not > touch other fields. To avoid complication, this operation will abort if > either file is open (isUnderConstruction() == true) > bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST > to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
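[Editor's illustration] The swapBlockList semantics described above can be sketched as follows: swap the block lists and the BLOCK_LAYOUT_AND_REDUNDANCY header bits of two files, leave PREFERRED_BLOCK_SIZE and STORAGE_POLICY_ID untouched, and abort if either file is under construction. All field and method names here are simplified stand-ins, not the real INodeFile API:

```java
import java.util.Arrays;

// Simplified model of the swapBlockList operation from the design doc.
public class SwapBlockListSketch {
  static class INodeFileStub {
    long preferredBlockSize;        // NOT swapped
    long blockLayoutAndRedundancy;  // swapped
    byte storagePolicyId;           // NOT swapped
    long[] blocks;                  // swapped (stands in for BlockInfo[])
    boolean underConstruction;
  }

  static void swapBlockList(INodeFileStub src, INodeFileStub dst) {
    if (src.underConstruction || dst.underConstruction) {
      // matches the doc: abort if isUnderConstruction() == true
      throw new IllegalStateException("cannot swap: file is under construction");
    }
    long[] tmpBlocks = src.blocks;
    src.blocks = dst.blocks;
    dst.blocks = tmpBlocks;
    long tmpLayout = src.blockLayoutAndRedundancy;
    src.blockLayoutAndRedundancy = dst.blockLayoutAndRedundancy;
    dst.blockLayoutAndRedundancy = tmpLayout;
    // a real implementation would also record OP_SWAP_BLOCK_LIST in the edit log
  }

  public static void main(String[] args) {
    INodeFileStub a = new INodeFileStub();
    a.blocks = new long[]{1, 2};
    INodeFileStub b = new INodeFileStub();
    b.blocks = new long[]{3};
    swapBlockList(a, b);
    System.out.println(Arrays.toString(a.blocks)); // [3]
    System.out.println(Arrays.toString(b.blocks)); // [1, 2]
  }
}
```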
[jira] [Created] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
Aravindan Vijayan created HDFS-14989: Summary: Add a 'swapBlockList' operation to Namenode. Key: HDFS-14989 URL: https://issues.apache.org/jira/browse/HDFS-14989 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?focusedWorklogId=343773=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343773 ] ASF GitHub Bot logged work on HDDS-2494: Author: ASF GitHub Bot Created on: 14/Nov/19 21:57 Start Date: 14/Nov/19 21:57 Worklog Time Spent: 10m Work Description: mbsharp commented on pull request #175: HDDS-2494 Sonar BigDecimal Cleanup URL: https://github.com/apache/hadoop-ozone/pull/175 ## What changes were proposed in this pull request? https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2494 ## How was this patch tested? Existing unit tests and clean local build This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343773) Remaining Estimate: 0h Time Spent: 10m > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
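[Editor's illustration] The rule behind HDDS-2494: `new BigDecimal(double)` captures the exact binary value of the double, not the decimal literal that was written in source; `BigDecimal.valueOf(double)` or the String constructor gives the intended decimal value. A quick demonstration:

```java
import java.math.BigDecimal;

// new BigDecimal(0.1) is NOT 0.1: it is the exact binary value of the
// nearest double, 0.1000000000000000055511151231257827...
public class BigDecimalExample {
  public static void main(String[] args) {
    BigDecimal fromDouble = new BigDecimal(0.1);       // flagged by Sonar
    BigDecimal fromValueOf = BigDecimal.valueOf(0.1);  // preferred
    BigDecimal fromString = new BigDecimal("0.1");     // also fine

    System.out.println(fromValueOf);                        // 0.1
    System.out.println(fromValueOf.compareTo(fromString));  // 0: same numeric value
    System.out.println(fromDouble.compareTo(fromString));   // 1: slightly greater than 0.1
  }
}
```

`BigDecimal.valueOf(double)` works because it routes through `Double.toString`, which produces the shortest decimal representation that round-trips to the same double.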
[jira] [Updated] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2494: - Labels: pull-request-available (was: ) > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > Labels: pull-request-available > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDDS-2494) Sonar - BigDecimal(double) should not be used
[ https://issues.apache.org/jira/browse/HDDS-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2494 started by Matthew Sharp. --- > Sonar - BigDecimal(double) should not be used > - > > Key: HDDS-2494 > URL: https://issues.apache.org/jira/browse/HDDS-2494 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Matthew Sharp >Assignee: Matthew Sharp >Priority: Minor > > Sonar Issue: > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2494) Sonar - BigDecimal(double) should not be used
Matthew Sharp created HDDS-2494: --- Summary: Sonar - BigDecimal(double) should not be used Key: HDDS-2494 URL: https://issues.apache.org/jira/browse/HDDS-2494 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Matthew Sharp Assignee: Matthew Sharp Sonar Issue: [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-0AKcVY8lQ4ZsKR=AW5md-0AKcVY8lQ4ZsKR] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2493) Sonar: Locking on a parameter in NetUtils.removeOutscope
[ https://issues.apache.org/jira/browse/HDDS-2493?focusedWorklogId=343756=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343756 ] ASF GitHub Bot logged work on HDDS-2493: Author: ASF GitHub Bot Created on: 14/Nov/19 21:26 Start Date: 14/Nov/19 21:26 Worklog Time Spent: 10m Work Description: swagle commented on pull request #174: HDDS-2493. Sonar: Locking on a parameter in NetUtils.removeOutscope. URL: https://github.com/apache/hadoop-ozone/pull/174 ## What changes were proposed in this pull request? Since the parameter reference can change, the locking should be on a shared object that is final. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2493 ## How was this patch tested? This does not change the logic around remove out of scope, waiting for unit and acceptance test results to analyze. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343756) Remaining Estimate: 0h Time Spent: 10m > Sonar: Locking on a parameter in NetUtils.removeOutscope > > > Key: HDDS-2493 > URL: https://issues.apache.org/jira/browse/HDDS-2493 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-2hKcVY8lQ4ZsNd=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2493) Sonar: Locking on a parameter in NetUtils.removeOutscope
[ https://issues.apache.org/jira/browse/HDDS-2493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2493: - Labels: pull-request-available (was: ) > Sonar: Locking on a parameter in NetUtils.removeOutscope > > > Key: HDDS-2493 > URL: https://issues.apache.org/jira/browse/HDDS-2493 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-2hKcVY8lQ4ZsNd=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2493) Sonar: Locking on a parameter in NetUtils.removeOutscope
Siddharth Wagle created HDDS-2493: - Summary: Sonar: Locking on a parameter in NetUtils.removeOutscope Key: HDDS-2493 URL: https://issues.apache.org/jira/browse/HDDS-2493 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Siddharth Wagle Fix For: 0.5.0 https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-2hKcVY8lQ4ZsNd=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
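[Editor's illustration] The HDDS-2493 finding, synchronizing on a method parameter, is unreliable because different callers may pass different objects, so threads do not necessarily contend on the same monitor. The PR description above states the fix: lock on a final object owned by the class. A sketch of the pattern (names are illustrative, not NetUtils' actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Lock on a private final field so every caller contends on the same monitor.
public class LockOnFieldExample {
  private final Object lock = new Object();   // shared, final monitor
  private final List<String> scope = new ArrayList<>();

  // BAD (what the Sonar rule flags):
  //   void removeOutOfScope(List<String> excluded) { synchronized (excluded) { ... } }

  public void add(String node) {
    synchronized (lock) {
      scope.add(node);
    }
  }

  public void removeOutOfScope(List<String> excluded) {
    synchronized (lock) {                     // not synchronized(excluded)
      scope.removeAll(excluded);
    }
  }

  public int size() {
    synchronized (lock) {
      return scope.size();
    }
  }

  public static void main(String[] args) {
    LockOnFieldExample e = new LockOnFieldExample();
    e.add("rack1/dn1");
    e.add("rack2/dn2");
    e.removeOutOfScope(List.of("rack1/dn1"));
    System.out.println(e.size());   // 1
  }
}
```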
[jira] [Work started] (HDDS-2455) Implement MiniOzoneHAClusterImpl#getOMLeader
[ https://issues.apache.org/jira/browse/HDDS-2455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2455 started by Siyao Meng. > Implement MiniOzoneHAClusterImpl#getOMLeader > > > Key: HDDS-2455 > URL: https://issues.apache.org/jira/browse/HDDS-2455 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Implement MiniOzoneHAClusterImpl#getOMLeader and use it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2487: --- Status: Patch Available (was: In Progress) > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available, sonar > Time Spent: 10m > Remaining Estimate: 0h > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?focusedWorklogId=343705=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343705 ] ASF GitHub Bot logged work on HDDS-2487: Author: ASF GitHub Bot Created on: 14/Nov/19 19:46 Start Date: 14/Nov/19 19:46 Worklog Time Spent: 10m Work Description: adoroszlai commented on pull request #173: HDDS-2487. Ensure streams are closed URL: https://github.com/apache/hadoop-ozone/pull/173 ## What changes were proposed in this pull request? Fix two stream close-related issues spotted by Sonar. https://issues.apache.org/jira/browse/HDDS-2487 ## How was this patch tested? Ran related unit tests (`TestContainerDataYaml` and `TestOmUtils`). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343705) Remaining Estimate: 0h Time Spent: 10m > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available, sonar > Time Spent: 10m > Remaining Estimate: 0h > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
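[Editor's illustration] The standard fix pattern for the "ensure streams are closed" findings in HDDS-2487 is try-with-resources: the stream is closed on every exit path, including exceptions. A minimal sketch (the ContainerDataYaml/OmUtils specifics are not reproduced here):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Declaring the reader in the try(...) header guarantees close() runs
// whether readLine() returns normally or throws.
public class TryWithResourcesExample {
  public static String firstLine(String content) throws IOException {
    try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
      return reader.readLine();   // reader.close() runs automatically afterwards
    }
  }

  public static void main(String[] args) throws IOException {
    System.out.println(firstLine("containerType: KeyValueContainer\nstate: OPEN"));
    // prints: containerType: KeyValueContainer
  }
}
```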
[jira] [Updated] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2487: - Labels: pull-request-available sonar (was: sonar) > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available, sonar > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work started] (HDDS-2487) Ensure streams are closed
[ https://issues.apache.org/jira/browse/HDDS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2487 started by Attila Doroszlai. -- > Ensure streams are closed > - > > Key: HDDS-2487 > URL: https://issues.apache.org/jira/browse/HDDS-2487 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: sonar > > * ContainerDataYaml: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU > * OmUtils: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2492) Fix test clean up issue in TestSCMPipelineManager
Xiaoyu Yao created HDDS-2492: Summary: Fix test clean up issue in TestSCMPipelineManager Key: HDDS-2492 URL: https://issues.apache.org/jira/browse/HDDS-2492 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Sammi Chen Assignee: Li Cheng This was opened based on [~sammichen]'s investigation on HDDS-2034. {quote}Failure is caused by newly introduced function TestSCMPipelineManager#testPipelineOpenOnlyWhenLeaderReported which doesn't close pipelineManager at the end. It's better to fix it in a new JIRA. {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
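The fix pattern here is simply to close the manager when the test finishes, in a finally block or a JUnit @After method, so one test's open resources cannot leak into later tests. A sketch of that shape with a hypothetical stand-in for the real SCMPipelineManager:

```java
public class CleanupPattern {

  // Hypothetical stand-in for SCMPipelineManager; only the close() contract
  // matters for illustrating the cleanup pattern.
  static class PipelineManagerStub implements AutoCloseable {
    boolean closed;

    @Override
    public void close() {
      closed = true;
    }
  }

  // Shape of the corrected test: the manager is always closed, even if the
  // assertions in the body throw. In JUnit this close would typically live
  // in an @After/@AfterEach method instead of a finally block.
  static PipelineManagerStub runTest() {
    PipelineManagerStub pipelineManager = new PipelineManagerStub();
    try {
      // ... exercise pipelineManager here ...
    } finally {
      pipelineManager.close(); // the cleanup step this JIRA adds
    }
    return pipelineManager;
  }
}
```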
[jira] [Created] (HDDS-2491) Fix TestSCMSafeModeWithPipelineRules
Xiaoyu Yao created HDDS-2491: Summary: Fix TestSCMSafeModeWithPipelineRules Key: HDDS-2491 URL: https://issues.apache.org/jira/browse/HDDS-2491 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.4.1 Reporter: Xiaoyu Yao This was based on [~sammichen]'s investigation on HDDS-2034. {quote}The root cause is failing to exit the safemode. The current pipeline open condition (HDDS-1868) is: got 3 datanode reports and one datanode marked itself as leader. In this failure case, the leader election succeeds while XceiverServerRatis#handleLeaderChangedNotification is not called in the next 3 minutes. So cluster.waitForClusterToBeReady() times out. The question is: is this Leader change notification reliable? What's the expected latency between leader election success and notification send? {quote} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2489) Sonar: Anonymous class based initialization in HddsClientUtils
[ https://issues.apache.org/jira/browse/HDDS-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HDDS-2489: - Labels: pull-request-available sonar (was: sonar) > Sonar: Anonymous class based initialization in HddsClientUtils > -- > > Key: HDDS-2489 > URL: https://issues.apache.org/jira/browse/HDDS-2489 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM Client >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_APKcVY8lQ4ZsWN=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2489) Sonar: Anonymous class based initialization in HddsClientUtils
[ https://issues.apache.org/jira/browse/HDDS-2489?focusedWorklogId=343677=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343677 ] ASF GitHub Bot logged work on HDDS-2489: Author: ASF GitHub Bot Created on: 14/Nov/19 19:09 Start Date: 14/Nov/19 19:09 Worklog Time Spent: 10m Work Description: swagle commented on pull request #172: HDDS-2489. Change anonymous class based initialization in HddsUtils. URL: https://github.com/apache/hadoop-ozone/pull/172 ## What changes were proposed in this pull request? Change anonymous class based initialization to sonar rule compatible way. ## What is the link to the Apache JIRA https://issues.apache.org/jira/browse/HDDS-2489 ## How was this patch tested? Simple change verified the build succeeds and TesteHddsClientUtils passed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343677) Remaining Estimate: 0h Time Spent: 10m > Sonar: Anonymous class based initialization in HddsClientUtils > -- > > Key: HDDS-2489 > URL: https://issues.apache.org/jira/browse/HDDS-2489 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM Client >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_APKcVY8lQ4ZsWN=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
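The Sonar rule behind this change targets anonymous-class ("double brace") collection initialization, which creates an extra class per use site and can pin the enclosing instance. A sketch of the flagged style next to the usual replacement; the field contents are hypothetical, not the actual HddsClientUtils code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class InitStyle {

  // Flagged style: anonymous HashSet subclass with an instance initializer.
  static final Set<String> BAD = new HashSet<String>() {{
    add("a");
    add("b");
  }};

  // Preferred: build the collection normally, then wrap it unmodifiable.
  static final Set<String> GOOD =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "b")));
}
```

Both fields hold the same elements; only the initialization style differs, which is all the rule cares about.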
[jira] [Updated] (HDDS-2442) Add ServiceName support for for getting Signed Cert.
[ https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2442: --- Summary: Add ServiceName support for for getting Signed Cert. (was: Add ServiceName support for for getting Singed Cert.) > Add ServiceName support for for getting Signed Cert. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2442) Add ServiceName support for getting Signed Cert.
[ https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2442: --- Summary: Add ServiceName support for getting Signed Cert. (was: Add ServiceName support for for getting Signed Cert.) > Add ServiceName support for getting Signed Cert. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2489) Sonar: Anonymous class based initialization in HddsClientUtils
[ https://issues.apache.org/jira/browse/HDDS-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2489: --- Labels: sonar (was: ) > Sonar: Anonymous class based initialization in HddsClientUtils > -- > > Key: HDDS-2489 > URL: https://issues.apache.org/jira/browse/HDDS-2489 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM Client >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Labels: sonar > Fix For: 0.5.0 > > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_APKcVY8lQ4ZsWN=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2490) Ensure OzoneClient is closed in Ozone Shell handlers
Attila Doroszlai created HDDS-2490: -- Summary: Ensure OzoneClient is closed in Ozone Shell handlers Key: HDDS-2490 URL: https://issues.apache.org/jira/browse/HDDS-2490 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone CLI Reporter: Attila Doroszlai OzoneClient should be closed in all command handlers ({{Handler}} subclasses). * https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-Y0KcVY8lQ4Zrz6=AW5md-Y0KcVY8lQ4Zrz6 * https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-ZGKcVY8lQ4Zr0b=AW5md-ZGKcVY8lQ4Zr0b etc. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2442) Add ServiceName support for for getting Singed Cert.
[ https://issues.apache.org/jira/browse/HDDS-2442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Purohit updated HDDS-2442: --- Summary: Add ServiceName support for for getting Singed Cert. (was: Add ServiceName support for Certificate Signing Request.) > Add ServiceName support for for getting Singed Cert. > > > Key: HDDS-2442 > URL: https://issues.apache.org/jira/browse/HDDS-2442 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Reporter: Anu Engineer >Assignee: Abhishek Purohit >Priority: Major > > We need to add support for adding Service name into the Certificate Signing > Request. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2489) Sonar: Anonymous class based initialization in HddsClientUtils
Siddharth Wagle created HDDS-2489: - Summary: Sonar: Anonymous class based initialization in HddsClientUtils Key: HDDS-2489 URL: https://issues.apache.org/jira/browse/HDDS-2489 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Client Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Siddharth Wagle Fix For: 0.5.0 https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_APKcVY8lQ4ZsWN=false=BUG -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2488) Not enough arguments for log messages in GrpcXceiverService
Attila Doroszlai created HDDS-2488: -- Summary: Not enough arguments for log messages in GrpcXceiverService Key: HDDS-2488 URL: https://issues.apache.org/jira/browse/HDDS-2488 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Attila Doroszlai GrpcXceiverService has log messages with too few arguments for placeholders. Only one of them is flagged by Sonar, but all seem to have the same problem. https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-69KcVY8lQ4ZsRZ=AW5md-69KcVY8lQ4ZsRZ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
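The rule Sonar applies here is that every "{}" placeholder in an SLF4J-style message must have a matching argument (a trailing Throwable is treated specially). A rough sketch of that check, using hypothetical message strings rather than the actual GrpcXceiverService ones:

```java
public class PlaceholderCheck {

  // Counts "{}" placeholders in an SLF4J-style message template.
  static int placeholderCount(String template) {
    int count = 0;
    int idx = 0;
    while ((idx = template.indexOf("{}", idx)) != -1) {
      count++;
      idx += 2;
    }
    return count;
  }

  // True when the argument count matches the placeholder count, which is the
  // property Sonar verifies for each log statement.
  static boolean argsMatch(String template, Object... args) {
    return placeholderCount(template) == args.length;
  }

  public static void main(String[] args) {
    // One argument for two placeholders: the mismatch this JIRA fixes.
    System.out.println(argsMatch("ContainerCommand stream {} failed: {}", "id-1"));
  }
}
```

The fix in each flagged statement is simply to pass one argument per placeholder (or drop the unused placeholder).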
[jira] [Created] (HDDS-2487) Ensure streams are closed
Attila Doroszlai created HDDS-2487: -- Summary: Ensure streams are closed Key: HDDS-2487 URL: https://issues.apache.org/jira/browse/HDDS-2487 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Attila Doroszlai Assignee: Attila Doroszlai * ContainerDataYaml: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-6IKcVY8lQ4ZsQU=AW5md-6IKcVY8lQ4ZsQU * OmUtils: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-hdKcVY8lQ4Zr76=AW5md-hdKcVY8lQ4Zr76 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2486) Empty test method in TestRDBTableStore
[ https://issues.apache.org/jira/browse/HDDS-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HDDS-2486: --- Description: {{TestRDBTableStore#toIOException}} is empty. https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5kKcVY8lQ4ZsQH=AW5md-5kKcVY8lQ4ZsQH Also {{TestTypedRDBTableStore#toIOException}}: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5qKcVY8lQ4ZsQJ=AW5md-5qKcVY8lQ4ZsQJ was: {{TestRDBTableStore#toIOException}} is empty. https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5kKcVY8lQ4ZsQH=AW5md-5kKcVY8lQ4ZsQH > Empty test method in TestRDBTableStore > -- > > Key: HDDS-2486 > URL: https://issues.apache.org/jira/browse/HDDS-2486 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Priority: Minor > Labels: sonar > > {{TestRDBTableStore#toIOException}} is empty. > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5kKcVY8lQ4ZsQH=AW5md-5kKcVY8lQ4ZsQH > Also {{TestTypedRDBTableStore#toIOException}}: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5qKcVY8lQ4ZsQJ=AW5md-5qKcVY8lQ4ZsQJ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2486) Empty test method in TestRDBTableStore
Attila Doroszlai created HDDS-2486: -- Summary: Empty test method in TestRDBTableStore Key: HDDS-2486 URL: https://issues.apache.org/jira/browse/HDDS-2486 Project: Hadoop Distributed Data Store Issue Type: Bug Components: test Reporter: Attila Doroszlai {{TestRDBTableStore#toIOException}} is empty. https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-5kKcVY8lQ4ZsQH=AW5md-5kKcVY8lQ4ZsQH -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2485) Disable XML external entity processing
Attila Doroszlai created HDDS-2485: -- Summary: Disable XML external entity processing Key: HDDS-2485 URL: https://issues.apache.org/jira/browse/HDDS-2485 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Attila Doroszlai Disable XML external entity processing in * NodeSchemaLoader: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-2nKcVY8lQ4ZsNm=AW5md-2nKcVY8lQ4ZsNm * ConfigFileAppender: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-_uKcVY8lQ4ZsVY=AW5md-_uKcVY8lQ4ZsVY -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
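For JAXP-based parsers like those behind NodeSchemaLoader and ConfigFileAppender, the usual mitigation is to disable DOCTYPEs and external entities on the factory before parsing. A sketch of the standard hardening, following common XXE-prevention guidance; the exact factory each class uses may differ:

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

public class SafeXmlFactory {

  // Returns a DocumentBuilderFactory with external entity processing disabled,
  // the standard mitigation for the XXE issues Sonar reports.
  static DocumentBuilderFactory newSafeFactory() {
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
      // Disallow DOCTYPE declarations entirely (strongest protection).
      dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
      // Belt and braces: also disable external general/parameter entities.
      dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
      dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException("Parser does not support secure features", e);
    }
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
    return dbf;
  }
}
```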
[jira] [Commented] (HDFS-14924) RenameSnapshot not updating new modification time
[ https://issues.apache.org/jira/browse/HDFS-14924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974497#comment-16974497 ] Íñigo Goiri commented on HDFS-14924: I think all the unit test failures are unrelated but it would be worth double checking. > RenameSnapshot not updating new modification time > - > > Key: HDFS-14924 > URL: https://issues.apache.org/jira/browse/HDFS-14924 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: hemanthboyina >Assignee: hemanthboyina >Priority: Major > Attachments: HDFS-14924.001.patch, HDFS-14924.002.patch > > > RenameSnapshot doesnt updating modification time -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2484) Ozone Manager - New Metrics for Trash Key Lists and Fails
Matthew Sharp created HDDS-2484: --- Summary: Ozone Manager - New Metrics for Trash Key Lists and Fails Key: HDDS-2484 URL: https://issues.apache.org/jira/browse/HDDS-2484 Project: Hadoop Distributed Data Store Issue Type: Sub-task Reporter: Matthew Sharp Assignee: Matthew Sharp We should add new metrics to track trash key lists and fails to OMMetrics Naming recommendations: NumTrashKeyLists, NumTrashKeyListFails -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2459) Refactor ReplicationManager to consider maintenance states
[ https://issues.apache.org/jira/browse/HDDS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974366#comment-16974366 ] Stephen O'Donnell edited comment on HDDS-2459 at 11/14/19 5:49 PM: --- In the decommission design doc, we had an algorithm to determine the number of replicas that need to be created or destroyed so a container can be perfectly replicated. The algorithm was: {code} /** * Calculate the number of the missing replicas. * * @return the number of the missing replicas. If it's less than zero, the container is over replicated. */ int getReplicationCount(int expectedCount, int healthy, int maintenance, int inFlight) { //for over replication, count only with the healthy replicas if (expectedCount < healthy) { return expectedCount - healthy; } replicaCount = expectedCount - (healthy + maintenance + inFlight); if (replicaCount == 0 && healthy < 1) { replicaCount ++; } //over replication is already handled return Math.max(0, replicaCount); } {code} The code from the design doc needs a minor correction to handle inflight deletes on over replication, and also to handle replication factor 1 containers, so it would look like this: {code} public int additionalReplicaNeeded2() { if (repFactor < healthyCount) { return repFactor - healthyCount + inFlightDel; } int delta = repFactor - (healthyCount + maintenanceCount + inFlightAdd - inFlightDel); if (delta == 0 && healthyCount < minHealthyForMaintenance) { delta += Math.min(repFactor, minHealthyForMaintenance) - healthyCount; } return Math.max(0, delta); } {code} I also came up with the logic below, which is very similar although a little more verbose. The only difference between the above and the below is that in the case of 3 in_service replicas and one or more inflight deletes, the above will return 1 new replica needed, but the below will return zero. 
The reasoning is that we should let the delete complete or not, as it may fail, and then deal with the over or under replication when the inflight operations have cleared. There is also a bug in the above if there is 1 IN_SERVICE and 3 MAINTENANCE and minHealthy = 2. In this case the logic returns zero rather than the intended 1. This scenario could come about if there are 3 hosts put in maintenance and then 1 new replica gets created. {code} /** * Calculates the delta of replicas which need to be created or removed * to ensure the container is correctly replicated. * * Decisions around over-replication are made only on healthy replicas, * ignoring any in maintenance and also any inflight adds. InFlight adds are * ignored, as they may not complete, so if we have: * * H, H, H, IN_FLIGHT_ADD * * And then schedule a delete, we could end up under-replicated (add fails, * delete completes). It is better to let the inflight operations complete * and then deal with any further over or under replication. * * For maintenance replicas, assuming replication factor 3, and minHealthy * 2, it is possible for all 3 hosts to be put into maintenance, leaving the * following (H = healthy, M = maintenance): * * H, H, M, M, M * * Even though we are tracking 5 replicas, this is not over replicated as we * ignore the maintenance copies. Later, the replicas could look like: * * H, H, H, H, M * * At this stage, the container is over replicated by 1, so one replica can be * removed. * * For containers which have replicationFactor healthy replicas, we ignore any * inflight add or deletes, as they may fail. Instead, wait for them to * complete and then deal with any excess or deficit. * * For under replicated containers we do consider inflight add and delete to * avoid scheduling more adds than needed. There is additional logic around * containers with maintenance replicas to ensure minHealthyForMaintenance * replicas are maintained. * * @return Delta of replicas needed. 
Negative indicates over replication and * containers should be removed. Positive indicates under replication * and zero indicates the container has replicationFactor healthy * replicas */ public int additionalReplicaNeeded() { int delta = repFactor - healthyCount; if (delta < 0) { // Over replicated, so may need to remove a block. Do not consider // inFlightAdds, as they may fail, but do consider inFlightDel which // will reduce the over-replication if it completes. return delta + inFlightDel; } else if (delta > 0) { // May be under-replicated, depending on maintenance. When a container is // under-replicated, we must consider inflight add and delete when // calculating the new containers needed. delta = Math.max(0, delta -
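The verbose variant quoted in this comment can be condensed into a self-contained sketch. The minHealthyForMaintenance cap follows the correction described in the comment; treat this as a sketch of the proposal under discussion, not the committed HDDS-2459 code:

```java
public class ReplicaCount {
  final int repFactor;
  final int healthyCount;
  final int maintenanceCount;
  final int inFlightAdd;
  final int inFlightDel;
  final int minHealthyForMaintenance;

  ReplicaCount(int repFactor, int healthy, int maintenance,
      int inFlightAdd, int inFlightDel, int minHealthyForMaintenance) {
    this.repFactor = repFactor;
    this.healthyCount = healthy;
    this.maintenanceCount = maintenance;
    this.inFlightAdd = inFlightAdd;
    this.inFlightDel = inFlightDel;
    this.minHealthyForMaintenance = minHealthyForMaintenance;
  }

  // Positive: replicas to add. Negative: replicas to remove. Zero: correctly
  // replicated once in-flight operations settle.
  int additionalReplicaNeeded() {
    int delta = repFactor - healthyCount;
    if (delta < 0) {
      // Over replicated: ignore inFlightAdd (it may fail) but count
      // inFlightDel, which reduces the excess if it completes.
      return delta + inFlightDel;
    } else if (delta > 0) {
      // Possibly under replicated: maintenance copies count toward the total,
      // but at least min(repFactor, minHealthyForMaintenance) healthy copies
      // must remain.
      delta = Math.max(0, delta - maintenanceCount);
      int minHealthy = Math.min(repFactor, minHealthyForMaintenance);
      int neededHealthy = Math.max(0, minHealthy - healthyCount);
      delta = Math.max(neededHealthy, delta);
      return delta - inFlightAdd + inFlightDel;
    }
    // Exactly repFactor healthy replicas: let in-flight operations complete
    // before scheduling anything further.
    return 0;
  }
}
```

For example, 2 healthy plus 3 maintenance replicas (repFactor 3, minHealthy 2) needs nothing, while 1 healthy plus 3 maintenance needs one more healthy copy, matching the bug scenario described above.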
[jira] [Comment Edited] (HDDS-2459) Refactor ReplicationManager to consider maintenance states
[ https://issues.apache.org/jira/browse/HDDS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974366#comment-16974366 ] Stephen O'Donnell edited comment on HDDS-2459 at 11/14/19 5:42 PM: --- In the decommission design doc, we had an algorithm to determine the number of replicas that need to be created or destroy so a container can be perfectly replicated. The algorithm was: {code} /** * Calculate the number of the missing replicas. * * @return the number of the missing replicas. If it's less than zero, the container is over replicated. */ int getReplicationCount(int expectedCount, int healthy, int maintenance, int inFlight) { //for over replication, count only with the healthy replicas if (expectedCount < healthy) { return expectedCount - healthy; } replicaCount = expectedCount - (healthy + maintenance + inFlight); if (replicaCount == 0 && healthy < 1) { replicaCount ++; } //over replication is already handled return Math.max(0, replicaCount); } {code} The code from the design doc needs a minor correction to handle inflight deletes on over replication, and also handling replication factor 1 containers, so it would look like this: {code} public int additionalReplicaNeeded2() { if (repFactor < healthyCount) { return repFactor - healthyCount + inFlightDel; } int delta = repFactor - (healthyCount + maintenanceCount + inFlightAdd - inFlightDel); if (delta == 0 && healthyCount < minHealthyForMaintenance) { delta += Math.min(repFactor, minHealthyForMaintenance) - healthyCount; } return Math.max(0, delta); } {code} I also came up with the logic below, which is very similar although a little more verbose. The only different between the above and the below, is that in the case of 3 in_service replicas and one or more inflight deletes, the above will return 1 new replica needed, but the below will return zero. 
The reasoning is that we should let the delete complete or not, as it may fail, and then deal with the over or under replication when the inflight operations have cleared. {code} /** * Calculates the the delta of replicas which need to be created or removed * to ensure the container is correctly replicated. * * Decisions around over-replication are made only on healthy replicas, * ignoring any in maintenance and also any inflight adds. InFlight adds are * ignored, as they may not complete, so if we have: * * H, H, H, IN_FLIGHT_ADD * * And then schedule a delete, we could end up under-replicated (add fails, * delete completes). It is better to let the inflight operations complete * and then deal with any further over or under replication. * * For maintenance replicas, assuming replication factor 3, and minHealthy * 2, it is possible for all 3 hosts to be put into maintenance, leaving the * following (H = healthy, M = maintenance): * * H, H, M, M, M * * Even though we are tracking 5 replicas, this is not over replicated as we * ignore the maintenance copies. Later, the replicas could look like: * * H, H, H, H, M * * At this stage, the container is over replicated by 1, so one replica can be * removed. * * For containers which have replication factor healthy replica, we ignore any * inflight add or deletes, as they may fail. Instead, wait for them to * complete and then deal with any excess or deficit. * * For under replicated containers we do consider inflight add and delete to * avoid scheduling more adds than needed. There is additional logic around * containers with maintenance replica to ensure minHealthyForMaintenance * replia are maintained/ * * @return Delta of replicas needed. Negative indicates over replication and * containers should be removed. 
Positive indicates over replication * and zero indicates the containers has replicationFactor healthy * replica */ public int additionalReplicaNeeded() { int delta = repFactor - healthyCount; if (delta < 0) { // Over replicated, so may need to remove a block. Do not consider // inFlightAdds, as they may fail, but do consider inFlightDel which // will reduce the over-replication if it completes. return delta + inFlightDel; } else if (delta > 0) { // May be under-replicated, depending on maintenance. When a container is // under-replicated, we must consider inflight add and delete when // calculating the new containers needed. delta = Math.max(0, delta - maintenanceCount); // Check we have enough healthy replicas minHealthyForMaintenance = Math.min(repFactor, minHealthyForMaintenance); int neededHealthy = Math.max(0, minHealthyForMaintenance - healthyCount); delta = Math.max(neededHealthy, delta);
[jira] [Commented] (HDDS-2372) Datanode pipeline is failing with NoSuchFileException
[ https://issues.apache.org/jira/browse/HDDS-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974483#comment-16974483 ] Anu Engineer commented on HDDS-2372: bq. It's possible to remove the usage of the tmp files but only if we allow overwrite for all the chunk files (in case of a leader failure the next attempt to write may find the previous chunk file in place). It may be accepted but it's a change with more risk. Why is this an enforced constraint? It is an artifact of our code. It should be trivial to check if the file exists, and write chunk_file_v1, chunk_file_v2 etc. Anyway, as you mentioned, we will rewrite this whole path anyway. So it is probably ok to do what you think works now. > Datanode pipeline is failing with NoSuchFileException > - > > Key: HDDS-2372 > URL: https://issues.apache.org/jira/browse/HDDS-2372 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Critical > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Found it on a k8s based test cluster using a simple 3 node cluster and > HDDS-2327 freon test. After a while the StateMachine become unhealthy after > this error: > {code:java} > datanode-0 datanode java.util.concurrent.ExecutionException: > java.util.concurrent.ExecutionException: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > java.nio.file.NoSuchFileException: > /data/storage/hdds/2a77fab9-9dc5-4f73-9501-b5347ac6145c/current/containerDir0/1/chunks/gGYYgiTTeg_testdata_chunk_13931.tmp.2.20830 > {code} > Can be reproduced. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2459) Refactor ReplicationManager to consider maintenance states
[ https://issues.apache.org/jira/browse/HDDS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974366#comment-16974366 ] Stephen O'Donnell edited comment on HDDS-2459 at 11/14/19 5:28 PM: --- In the decommission design doc, we had an algorithm to determine the number of replicas that need to be created or destroy so a container can be perfectly replicated. The algorithm was: {code} /** * Calculate the number of the missing replicas. * * @return the number of the missing replicas. If it's less than zero, the container is over replicated. */ int getReplicationCount(int expectedCount, int healthy, int maintenance, int inFlight) { //for over replication, count only with the healthy replicas if (expectedCount < healthy) { return expectedCount - healthy; } replicaCount = expectedCount - (healthy + maintenance + inFlight); if (replicaCount == 0 && healthy < 1) { replicaCount ++; } //over replication is already handled return Math.max(0, replicaCount); } {code} The code from the design doc needs a minor correction to handle inflight deletes on over replication, so it would look like this: {code} public int additionalReplicaNeeded2() { if (repFactor < healthyCount) { return repFactor - healthyCount + inFlightDel; } int delta = repFactor - (healthyCount + maintenanceCount + inFlightAdd - inFlightDel); if (delta == 0 && healthyCount < minHealthyForMaintenance) { delta += minHealthyForMaintenance - healthyCount; } return Math.max(0, delta); } {code} I also came up with the logic below, which is very similar although a little more verbose. The only different between the above and the below, is that in the case of 3 in_service replicas and one or more inflight deletes, the above will return 1 new replica needed, but the below will return zero. The reasoning is that we should let the delete complete or not, as it may fail, and then deal with the over or under replication when the inflight operations have cleared. 
{code} /** * Calculates the the delta of replicas which need to be created or removed * to ensure the container is correctly replicated. * * Decisions around over-replication are made only on healthy replicas, * ignoring any in maintenance and also any inflight adds. InFlight adds are * ignored, as they may not complete, so if we have: * * H, H, H, IN_FLIGHT_ADD * * And then schedule a delete, we could end up under-replicated (add fails, * delete completes). It is better to let the inflight operations complete * and then deal with any further over or under replication. * * For maintenance replicas, assuming replication factor 3, and minHealthy * 2, it is possible for all 3 hosts to be put into maintenance, leaving the * following (H = healthy, M = maintenance): * * H, H, M, M, M * * Even though we are tracking 5 replicas, this is not over replicated as we * ignore the maintenance copies. Later, the replicas could look like: * * H, H, H, H, M * * At this stage, the container is over replicated by 1, so one replica can be * removed. * * For containers which have replication factor healthy replica, we ignore any * inflight add or deletes, as they may fail. Instead, wait for them to * complete and then deal with any excess or deficit. * * For under replicated containers we do consider inflight add and delete to * avoid scheduling more adds than needed. There is additional logic around * containers with maintenance replica to ensure minHealthyForMaintenance * replia are maintained/ * * @return Delta of replicas needed. Negative indicates over replication and * containers should be removed. Positive indicates over replication * and zero indicates the containers has replicationFactor healthy * replica */ public int additionalReplicaNeeded() { int delta = repFactor - healthyCount; if (delta < 0) { // Over replicated, so may need to remove a block. 
Do not consider // inFlightAdds, as they may fail, but do consider inFlightDel which // will reduce the over-replication if it completes. return delta + inFlightDel; } else if (delta > 0) { // May be under-replicated, depending on maintenance. When a container is // under-replicated, we must consider inflight add and delete when // calculating the new containers needed. delta = Math.max(0, delta - maintenanceCount); // Check we have enough healthy replicas int neededHealthy = Math.max(0, minHealthyForMaintenance - healthyCount); delta = Math.max(neededHealthy, delta); return delta - inFlightAdd + inFlightDel; } else { // delta == 0 // We have exactly the number of healthy replicas needed, but there may
[jira] [Work logged] (HDDS-2481) Close streams in TarContainerPacker
[ https://issues.apache.org/jira/browse/HDDS-2481?focusedWorklogId=343561=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343561 ] ASF GitHub Bot logged work on HDDS-2481: Author: ASF GitHub Bot Created on: 14/Nov/19 17:25 Start Date: 14/Nov/19 17:25 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #167: HDDS-2481. Close streams in TarContainerPacker URL: https://github.com/apache/hadoop-ozone/pull/167 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343561) Time Spent: 20m (was: 10m) > Close streams in TarContainerPacker > --- > > Key: HDDS-2481 > URL: https://issues.apache.org/jira/browse/HDDS-2481 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Ensure various streams are closed in {{TarContainerPacker}}: > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUH=AW5md-9bKcVY8lQ4ZsUH > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUL=AW5md-9bKcVY8lQ4ZsUL > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUK=AW5md-9bKcVY8lQ4ZsUK > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUJ=AW5md-9bKcVY8lQ4ZsUJ > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUI=AW5md-9bKcVY8lQ4ZsUI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2459) Refactor ReplicationManager to consider maintenance states
[ https://issues.apache.org/jira/browse/HDDS-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974366#comment-16974366 ] Stephen O'Donnell edited comment on HDDS-2459 at 11/14/19 5:25 PM: --- In the decommission design doc, we had an algorithm to determine the number of replicas that need to be created or destroyed so that a container can be perfectly replicated. The algorithm was:
{code}
/**
 * Calculate the number of the missing replicas.
 *
 * @return the number of the missing replicas. If it's less than zero,
 *         the container is over replicated.
 */
int getReplicationCount(int expectedCount, int healthy, int maintenance,
    int inFlight) {
  // for over replication, count only the healthy replicas
  if (expectedCount < healthy) {
    return expectedCount - healthy;
  }
  int replicaCount = expectedCount - (healthy + maintenance + inFlight);
  if (replicaCount == 0 && healthy < 1) {
    replicaCount++;
  }
  // over replication is already handled
  return Math.max(0, replicaCount);
}
{code}
The code from the design doc needs a minor correction to handle inflight deletes on over replication, so it would look like this:
{code}
public int additionalReplicaNeeded2() {
  if (repFactor < healthyCount) {
    return repFactor - healthyCount + inFlightDel;
  }
  int delta = repFactor - (healthyCount + maintenanceCount
      + inFlightAdd - inFlightDel);
  if (delta == 0 && healthyCount < minHealthyForMaintenance) {
    delta += minHealthyForMaintenance - healthyCount;
  }
  return Math.max(0, delta);
}
{code}
I also came up with the logic below, which is very similar although a little more verbose. The only difference between the above and the below is that in the case of 3 in_service replicas and one or more inflight deletes, the above will return 1 new replica needed, but the below will return zero. The reasoning is that we should let the delete complete or not, as it may fail, and then deal with the over or under replication when the inflight operations have cleared.
{code}
/**
 * Calculates the delta of replicas which need to be created or removed
 * to ensure the container is correctly replicated.
 *
 * Decisions around over-replication are made only on healthy replicas,
 * ignoring any in maintenance and also any inflight adds. Inflight adds
 * are ignored, as they may not complete, so if we have:
 *
 * H, H, H, IN_FLIGHT_ADD
 *
 * and then schedule a delete, we could end up under-replicated (add fails,
 * delete completes). It is better to let the inflight operations complete
 * and then deal with any further over or under replication.
 *
 * For maintenance replicas, assuming replication factor 3 and minHealthy
 * 2, it is possible for all 3 hosts to be put into maintenance, leaving
 * the following (H = healthy, M = maintenance):
 *
 * H, H, M, M, M
 *
 * Even though we are tracking 5 replicas, this is not over replicated as
 * we ignore the maintenance copies. Later, the replicas could look like:
 *
 * H, H, H, H, M
 *
 * At this stage, the container is over replicated by 1, so one replica
 * can be removed.
 *
 * For containers which have replicationFactor healthy replicas, we ignore
 * any inflight adds or deletes, as they may fail. Instead, wait for them
 * to complete and then deal with any excess or deficit.
 *
 * For under replicated containers we do consider inflight adds and
 * deletes to avoid scheduling more adds than needed. There is additional
 * logic around containers with maintenance replicas to ensure
 * minHealthyForMaintenance replicas are maintained.
 *
 * @return Delta of replicas needed. Negative indicates over replication
 *         and containers should be removed. Positive indicates under
 *         replication, and zero indicates the container has
 *         replicationFactor healthy replicas.
 */
public int additionalReplicaNeeded() {
  int delta = repFactor - healthyCount;
  if (delta < 0) {
    // Over replicated, so may need to remove a block. Do not consider
    // inFlightAdds, as they may fail, but do consider inFlightDel which
    // will reduce the over-replication if it completes.
    return delta + inFlightDel;
  } else if (delta > 0) {
    // May be under-replicated, depending on maintenance. When a container
    // is under-replicated, we must consider inflight add and delete when
    // calculating the new containers needed.
    delta = Math.max(0, delta - maintenanceCount);
    // Check we have enough healthy replicas
    int neededHealthy = Math.max(0, minHealthyForMaintenance - healthyCount);
    delta = Math.max(neededHealthy, delta);
    return delta - inFlightAdd + inFlightDel;
  } else { // delta == 0
    // We have exactly the number of healthy replicas needed, but there may
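The replica-delta logic quoted above can be exercised as a small standalone class. This is a hypothetical sketch, not the actual patch: the class name and the flattening of the fields into method parameters are illustrative, and the truncated {{delta == 0}} branch is simplified to return zero, per the comment's own description of that case.

```java
// Hypothetical standalone sketch of the replica-delta logic discussed in
// the comment above. Names and signature are illustrative only.
public class ReplicaDeltaSketch {

  static int additionalReplicaNeeded(int repFactor, int healthyCount,
      int maintenanceCount, int inFlightAdd, int inFlightDel,
      int minHealthyForMaintenance) {
    int delta = repFactor - healthyCount;
    if (delta < 0) {
      // Over replicated: ignore inflight adds (they may fail), but count
      // inflight deletes, which reduce the excess if they complete.
      return delta + inFlightDel;
    } else if (delta > 0) {
      // Possibly under replicated; maintenance copies cover part of the gap.
      delta = Math.max(0, delta - maintenanceCount);
      // Ensure at least minHealthyForMaintenance healthy replicas remain.
      int neededHealthy =
          Math.max(0, minHealthyForMaintenance - healthyCount);
      delta = Math.max(neededHealthy, delta);
      return delta - inFlightAdd + inFlightDel;
    }
    // Exactly repFactor healthy replicas: let inflight operations settle
    // before scheduling anything (simplification of the truncated branch).
    return 0;
  }

  public static void main(String[] args) {
    // H, H, M, M, M with repFactor 3, minHealthy 2: correctly replicated.
    System.out.println(additionalReplicaNeeded(3, 2, 3, 0, 0, 2)); // 0
    // H, H, H, H, M: over replicated by one, so remove one.
    System.out.println(additionalReplicaNeeded(3, 4, 1, 0, 0, 2)); // -1
    // 3 healthy plus 1 inflight delete: zero, per the verbose variant.
    System.out.println(additionalReplicaNeeded(3, 3, 0, 0, 1, 2)); // 0
  }
}
```

Running the scenarios from the javadoc confirms the behaviour described: the all-maintenance case needs no new replicas, and the inflight-delete case waits rather than scheduling a replacement.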
[jira] [Updated] (HDDS-2481) Close streams in TarContainerPacker
[ https://issues.apache.org/jira/browse/HDDS-2481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2481: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thank you for fixing this issue. I greatly appreciate it. I have committed this patch to the master. > Close streams in TarContainerPacker > --- > > Key: HDDS-2481 > URL: https://issues.apache.org/jira/browse/HDDS-2481 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Ensure various streams are closed in {{TarContainerPacker}}: > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUH=AW5md-9bKcVY8lQ4ZsUH > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUL=AW5md-9bKcVY8lQ4ZsUL > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUK=AW5md-9bKcVY8lQ4ZsUK > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUJ=AW5md-9bKcVY8lQ4ZsUJ > * > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-9bKcVY8lQ4ZsUI=AW5md-9bKcVY8lQ4ZsUI -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2473. Resolution: Fixed Thank you for an excellent patch. Much appreciated. I have committed this to the Master. > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?focusedWorklogId=343559=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343559 ] ASF GitHub Bot logged work on HDDS-2473: Author: ASF GitHub Bot Created on: 14/Nov/19 17:20 Start Date: 14/Nov/19 17:20 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #162: HDDS-2473. Fix code reliability issues found by Sonar in Ozone Recon module. URL: https://github.com/apache/hadoop-ozone/pull/162 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343559) Time Spent: 20m (was: 10m) > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2450) Datanode ReplicateContainer thread pool should be configurable
[ https://issues.apache.org/jira/browse/HDDS-2450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2450: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) [~avijayan] Thanks for the review. [~sodonnell] Thanks for the contribution. I have committed this patch to the Master branch. > Datanode ReplicateContainer thread pool should be configurable > -- > > Key: HDDS-2450 > URL: https://issues.apache.org/jira/browse/HDDS-2450 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > The replicateContainer command uses a ReplicationSupervisor object to > implement a threadpool used to process replication commands. > In DatanodeStateMachine this thread pool is initialized with a hard coded > number of threads (10). This should be made configurable with a default value > of 10. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2450) Datanode ReplicateContainer thread pool should be configurable
[ https://issues.apache.org/jira/browse/HDDS-2450?focusedWorklogId=343554=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343554 ] ASF GitHub Bot logged work on HDDS-2450: Author: ASF GitHub Bot Created on: 14/Nov/19 17:12 Start Date: 14/Nov/19 17:12 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #134: HDDS-2450 Datanode ReplicateContainer thread pool should be configurable URL: https://github.com/apache/hadoop-ozone/pull/134 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343554) Time Spent: 20m (was: 10m) > Datanode ReplicateContainer thread pool should be configurable > -- > > Key: HDDS-2450 > URL: https://issues.apache.org/jira/browse/HDDS-2450 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Datanode >Affects Versions: 0.5.0 >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > > The replicateContainer command uses a ReplicationSupervisor object to > implement a threadpool used to process replication commands. > In DatanodeStateMachine this thread pool is initialized with a hard coded > number of threads (10). This should be made configurable with a default value > of 10. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
[ https://issues.apache.org/jira/browse/HDDS-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2479. Fix Version/s: 0.5.0 Resolution: Fixed > Sonar : replace instanceof with catch block in > XceiverClientGrpc.sendCommandWithRetry > - > > Key: HDDS-2479 > URL: https://issues.apache.org/jira/browse/HDDS-2479 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issue: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
[ https://issues.apache.org/jira/browse/HDDS-2479?focusedWorklogId=343546=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343546 ] ASF GitHub Bot logged work on HDDS-2479: Author: ASF GitHub Bot Created on: 14/Nov/19 17:01 Start Date: 14/Nov/19 17:01 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #168: HDDS-2479. Sonar : replace instanceof with catch block in XceiverClientGrpc sendCommandWithRetry URL: https://github.com/apache/hadoop-ozone/pull/168 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343546) Time Spent: 20m (was: 10m) > Sonar : replace instanceof with catch block in > XceiverClientGrpc.sendCommandWithRetry > - > > Key: HDDS-2479 > URL: https://issues.apache.org/jira/browse/HDDS-2479 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issue: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
[ https://issues.apache.org/jira/browse/HDDS-2479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974437#comment-16974437 ] Anu Engineer commented on HDDS-2479: Thank you for the contribution. I have committed this patch to the master. > Sonar : replace instanceof with catch block in > XceiverClientGrpc.sendCommandWithRetry > - > > Key: HDDS-2479 > URL: https://issues.apache.org/jira/browse/HDDS-2479 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issue: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
[ https://issues.apache.org/jira/browse/HDDS-2480?focusedWorklogId=343531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343531 ] ASF GitHub Bot logged work on HDDS-2480: Author: ASF GitHub Bot Created on: 14/Nov/19 16:54 Start Date: 14/Nov/19 16:54 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #170: HDDS-2480. Sonar : remove log spam for exceptions inside XceiverClientGrpc reconnect URL: https://github.com/apache/hadoop-ozone/pull/170 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343531) Time Spent: 20m (was: 10m) > Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect > - > > Key: HDDS-2480 > URL: https://issues.apache.org/jira/browse/HDDS-2480 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issue: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsWE=AW5md_AGKcVY8lQ4ZsWE -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
[ https://issues.apache.org/jira/browse/HDDS-2480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2480. Fix Version/s: 0.5.0 Resolution: Fixed [~sdeka] Thank you for the contribution. I have committed this patch to the master branch. > Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect > - > > Key: HDDS-2480 > URL: https://issues.apache.org/jira/browse/HDDS-2480 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issue: > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsWE=AW5md_AGKcVY8lQ4ZsWE -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer updated HDDS-2478: --- Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) [~adoroszlai] Thanks for the review. [~sdeka] Thanks for the patch. > Sonar : remove temporary variable in XceiverClientGrpc.sendCommand > -- > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974428#comment-16974428 ] Anu Engineer edited comment on HDDS-2478 at 11/14/19 4:51 PM: -- [~adoroszlai] Thanks for the review. [~sdeka] Thanks for the patch. I have committed this to the master branch. was (Author: anu): [~adoroszlai] Thanks for the review. [~sdeka] Thanks for the patch. > Sonar : remove temporary variable in XceiverClientGrpc.sendCommand > -- > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Work logged] (HDDS-2478) Sonar : remove temporary variable in XceiverClientGrpc.sendCommand
[ https://issues.apache.org/jira/browse/HDDS-2478?focusedWorklogId=343526=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-343526 ] ASF GitHub Bot logged work on HDDS-2478: Author: ASF GitHub Bot Created on: 14/Nov/19 16:49 Start Date: 14/Nov/19 16:49 Worklog Time Spent: 10m Work Description: anuengineer commented on pull request #165: HDDS-2478. Sonar : remove temporary variable in XceiverClientGrpc sendCommand URL: https://github.com/apache/hadoop-ozone/pull/165 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 343526) Time Spent: 20m (was: 10m) > Sonar : remove temporary variable in XceiverClientGrpc.sendCommand > -- > > Key: HDDS-2478 > URL: https://issues.apache.org/jira/browse/HDDS-2478 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Supratim Deka >Assignee: Supratim Deka >Priority: Minor > Labels: pull-request-available, sonar > Time Spent: 20m > Remaining Estimate: 0h > > Sonar issues : > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1 > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV2=AW5md_AGKcVY8lQ4ZsV2 -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14973) Balancer getBlocks RPC dispersal does not function properly
[ https://issues.apache.org/jira/browse/HDFS-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16974418#comment-16974418 ] Erik Krogen commented on HDFS-14973: Thanks for the comments [~shv], all great points. I put up patch v003 addressing them. > Balancer getBlocks RPC dispersal does not function properly > --- > > Key: HDFS-14973 > URL: https://issues.apache.org/jira/browse/HDFS-14973 > Project: Hadoop HDFS > Issue Type: Bug > Components: balancer mover >Affects Versions: 2.9.0, 2.7.4, 2.8.2, 3.0.0 >Reporter: Erik Krogen >Assignee: Erik Krogen >Priority: Major > Attachments: HDFS-14973.000.patch, HDFS-14973.001.patch, > HDFS-14973.002.patch, HDFS-14973.003.patch, HDFS-14973.test.patch > > > In HDFS-11384, a mechanism was added to make the {{getBlocks}} RPC calls > issued by the Balancer/Mover more dispersed, to alleviate load on the > NameNode, since {{getBlocks}} can be very expensive and the Balancer should > not impact normal cluster operation. > Unfortunately, this functionality does not function as expected, especially > when the dispatcher thread count is low. The primary issue is that the delay > is applied only to the first N threads that are submitted to the dispatcher's > executor, where N is the size of the dispatcher's threadpool, but *not* to > the first R threads, where R is the number of allowed {{getBlocks}} QPS > (currently hardcoded to 20). For example, if the threadpool size is 100 (the > default), threads 0-19 have no delay, 20-99 have increased levels of delay, > and 100+ have no delay. As I understand it, the intent of the logic was that > the delay applied to the first 100 threads would force the dispatcher > executor's threads to all be consumed, thus blocking subsequent (non-delayed) > threads until the delay period has expired. 
However, threads 0-19 can finish > very quickly (their work can often be fulfilled in the time it takes to > execute a single {{getBlocks}} RPC, on the order of tens of milliseconds), > thus opening up 20 new slots in the executor, which are then consumed by > non-delayed threads 100-119, and so on. So, although 80 threads have had a > delay applied, the non-delay threads rush through in the 20 non-delay slots. > This problem gets even worse when the dispatcher threadpool size is less than > the max {{getBlocks}} QPS. For example, if the threadpool size is 10, _no > threads ever have a delay applied_, and the feature is not enabled at all. > This problem wasn't surfaced in the original JIRA because the test > incorrectly measured the period across which {{getBlocks}} RPCs were > distributed. The variables {{startGetBlocksTime}} and {{endGetBlocksTime}} > were used to track the time over which the {{getBlocks}} calls were made. > However, {{startGetBlocksTime}} was initialized at the time of creation of > the {{FSNameystem}} spy, which is before the mock DataNodes are started. Even > worse, the Balancer in this test takes 2 iterations to complete balancing the > cluster, so the time period {{endGetBlocksTime - startGetBlocksTime}} > actually represents: > {code} > (time to submit getBlocks RPCs) + (DataNode startup time) + (time for the > Dispatcher to complete an iteration of moving blocks) > {code} > Thus, the RPC QPS reported by the test is much lower than the RPC QPS seen > during the period of initial block fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
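The flaw described in the issue can be modelled with a small sketch. This is a hypothetical illustration of the delay assignment as the report describes it, not the actual HDFS-11384 code: the method name, parameters, and period value are assumptions.

```java
// Hypothetical model of the per-thread delay described in the report:
// only the first poolSize submissions are eligible for a delay, and within
// those, the first maxQps get none. Names are illustrative only.
public class GetBlocksDelaySketch {

  static long delayMs(int threadIndex, int poolSize, int maxQps,
      long periodMs) {
    if (threadIndex >= poolSize) {
      return 0; // threads beyond the pool size are never delayed -- the flaw
    }
    // Integer division groups threads into batches of maxQps per period.
    return ((long) (threadIndex / maxQps)) * periodMs;
  }

  public static void main(String[] args) {
    // Pool size 100, 20 QPS, assumed 1s period: threads 0-19 and 100+ get
    // no delay, so non-delayed threads keep refilling the fast slots.
    System.out.println(delayMs(0, 100, 20, 1000));   // 0
    System.out.println(delayMs(20, 100, 20, 1000));  // 1000
    System.out.println(delayMs(99, 100, 20, 1000));  // 4000
    System.out.println(delayMs(100, 100, 20, 1000)); // 0
    // Pool size 10 (below maxQps): no thread is ever delayed at all.
    System.out.println(delayMs(9, 10, 20, 1000));    // 0
  }
}
```

The last case reproduces the report's observation that with a threadpool smaller than the max getBlocks QPS, the dispersal feature is effectively disabled.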