[jira] [Resolved] (HDFS-17449) Fix ill-formed decommission host name and port pair triggers IndexOutOfBound error
[ https://issues.apache.org/jira/browse/HDFS-17449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17449. - Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix ill-formed decommission host name and port pair triggers IndexOutOfBound > error > --- > > Key: HDFS-17449 > URL: https://issues.apache.org/jira/browse/HDFS-17449 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ConfX >Assignee: ConfX >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > h2. What happened: > Got IndexOutOfBound when trying to run > org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor#testDecommissionStatusAfterDNRestart > with namenode host provider set to > org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager. > h2. Buggy code: > In HostsFileWriter.java: > {code:java} > String[] hostAndPort = hostNameAndPort.split(":"); // hostNameAndPort might > be invalid > dn.setHostName(hostAndPort[0]); > dn.setPort(Integer.parseInt(hostAndPort[1])); // here IndexOutOfBound might > be thrown > dn.setAdminState(AdminStates.DECOMMISSIONED);{code} > h2. StackTrace: > {code:java} > java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1 > at > org.apache.hadoop.hdfs.util.HostsFileWriter.initOutOfServiceHosts(HostsFileWriter.java:110){code} > h2. How to reproduce: > (1) Set {{dfs.namenode.hosts.provider.classname}} to > {{org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager}} > (2) Run test: > {{org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatusWithBackoffMonitor#testDecommissionStatusAfterDNRestart}} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
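The buggy code above indexes the result of `split(":")` without checking how many parts came back, so a host name with no port throws ArrayIndexOutOfBoundsException. A minimal defensive-parsing sketch of the idea in plain Java (class and helper names here are illustrative, not the actual Hadoop patch):

```java
// Sketch of defensive "host:port" parsing that rejects ill-formed input
// with a descriptive error instead of an ArrayIndexOutOfBoundsException.
// Names are illustrative, not the actual HDFS-17449 fix.
public class HostPortParser {

    /** Parses "host:port", throwing IllegalArgumentException for ill-formed pairs. */
    public static String[] parse(String hostNameAndPort) {
        int idx = hostNameAndPort.lastIndexOf(':');
        // Reject: no colon, empty host, or empty port.
        if (idx <= 0 || idx == hostNameAndPort.length() - 1) {
            throw new IllegalArgumentException(
                "Invalid host:port pair: " + hostNameAndPort);
        }
        String host = hostNameAndPort.substring(0, idx);
        String port = hostNameAndPort.substring(idx + 1);
        Integer.parseInt(port);  // validate the port is numeric
        return new String[] {host, port};
    }

    public static void main(String[] args) {
        String[] hp = parse("dn1.example.com:9866");
        System.out.println(hp[0] + " " + hp[1]);
        try {
            parse("dn1.example.com");  // no port: rejected, not AIOOBE
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast with a clear message also keeps the test error pointing at the bad configuration rather than at an array index deep inside HostsFileWriter.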
[jira] [Resolved] (HDFS-17450) Add explicit dependency on httpclient jar
[ https://issues.apache.org/jira/browse/HDFS-17450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17450. - Fix Version/s: 3.4.1 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Add explicit dependency on httpclient jar > - > > Key: HDFS-17450 > URL: https://issues.apache.org/jira/browse/HDFS-17450 > Project: Hadoop HDFS > Issue Type: Task >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.4.1, 3.5.0 > > > Follow up to https://issues.apache.org/jira/browse/HADOOP-18890 > A previous [PR|https://github.com/apache/hadoop/pull/6057] for this issue > removed okhttp usage and used Apache HttpClient instead. The dependency on > HttpClient is indirect (a transitive dependency). I think it is better to > make the dependency explicit in hadoop-hdfs-client - the only project that > was significantly modified.
[jira] [Resolved] (HDFS-17448) Enhance the stability of the unit test TestDiskBalancerCommand
[ https://issues.apache.org/jira/browse/HDFS-17448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17448. - Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Enhance the stability of the unit test TestDiskBalancerCommand > > > Key: HDFS-17448 > URL: https://issues.apache.org/jira/browse/HDFS-17448 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > TestDiskBalancerCommand#testDiskBalancerQueryWithoutSubmitAndMultipleNodes > frequently fails tests, such as: > https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1540/testReport/junit/org.apache.hadoop.hdfs.server.diskbalancer.command/TestDiskBalancerCommand/testDiskBalancerQueryWithoutSubmitAndMultipleNodes/ > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6637/1/testReport/org.apache.hadoop.hdfs.server.diskbalancer.command/TestDiskBalancerCommand/testDiskBalancerQueryWithoutSubmitAndMultipleNodes/ > I will fix it to enhance the stability of the unit test.
[jira] [Resolved] (HDFS-17103) Fix file system cleanup in TestNameEditsConfigs
[ https://issues.apache.org/jira/browse/HDFS-17103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17103. - Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix file system cleanup in TestNameEditsConfigs > > > Key: HDFS-17103 > URL: https://issues.apache.org/jira/browse/HDFS-17103 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ConfX >Assignee: ConfX >Priority: Critical > Labels: pull-request-available > Fix For: 3.5.0 > > Attachments: reproduce.sh > > > h2. What happened: > Got a {{NullPointerException}} without message when running > {{{}TestNameEditsConfigs{}}}. > h2. Where's the bug: > In line 450 of {{{}TestNameEditsConfigs{}}}, the test attempts to cleanup the > file system: > > {noformat} > ... > fileSys = cluster.getFileSystem(); > ... > } finally { > fileSys.close(); > cluster.shutdown(); > }{noformat} > However, the cleanup would result in a {{NullPointerException}} that covers > up the actual exception if the initialization of {{fileSys}} fails or another > exception is thrown before the line that initializes {{{}fileSys{}}}. > h2. How to reproduce: > (1) Set {{dfs.namenode.maintenance.replication.min}} to {{-1155969698}} > (2) Run test: > {{org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs#testNameEditsConfigsFailure}} > h2. Stacktrace: > {noformat} > java.lang.NullPointerException, > at > org.apache.hadoop.hdfs.server.namenode.TestNameEditsConfigs.testNameEditsConfigsFailure(TestNameEditsConfigs.java:450),{noformat} > For an easy reproduction, run the reproduce.sh in the attachment. > We are happy to provide a patch if this issue is confirmed. >
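The masking problem described above, where a close() on a never-initialized field throws from the finally block and hides the real failure, is typically avoided with a null-tolerant closer (Hadoop's own IOUtils.closeStream behaves this way). A self-contained stand-in sketch, with illustrative names:

```java
import java.io.Closeable;
import java.io.IOException;

public class SafeCleanup {

    /** Closes a resource only if it was actually initialized, and swallows
     *  close-time errors so they cannot mask the original test failure
     *  (the NullPointerException-in-finally bug described above). */
    static void closeQuietly(Closeable c) {
        if (c == null) {
            return;  // initialization never happened; nothing to close
        }
        try {
            c.close();
        } catch (IOException e) {
            // log-and-continue in real code; never rethrow from cleanup
        }
    }

    public static void main(String[] args) {
        Closeable fileSys = null;  // stands in for cluster.getFileSystem()
        try {
            // suppose an exception is thrown before fileSys is assigned
        } finally {
            closeQuietly(fileSys);  // no NPE even though fileSys is null
        }
        System.out.println("cleanup ok");
    }
}
```

With this shape, the exception that actually broke the test propagates out of the try block instead of being replaced by a NullPointerException from the cleanup.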
[jira] [Resolved] (HDFS-17317) DebugAdmin metaOut not need multiple close
[ https://issues.apache.org/jira/browse/HDFS-17317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17317. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > DebugAdmin metaOut not need multiple close > --- > > Key: HDFS-17317 > URL: https://issues.apache.org/jira/browse/HDFS-17317 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Reporter: xy >Assignee: xy >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > DebugAdmin metaOut not need multiple close
[jira] [Resolved] (HDFS-17215) RBF: Fix some method annotations about @throws
[ https://issues.apache.org/jira/browse/HDFS-17215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17215. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Fix some method annotations about @throws > --- > > Key: HDFS-17215 > URL: https://issues.apache.org/jira/browse/HDFS-17215 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.4 >Reporter: xiaojunxiang >Assignee: xiaojunxiang >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > The setQuota method annotation of the Quota class has an error, which is > described in the @throws section.
[jira] [Resolved] (HDFS-17056) EC: Fix verifyClusterSetup output in case of an invalid param.
[ https://issues.apache.org/jira/browse/HDFS-17056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17056. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > EC: Fix verifyClusterSetup output in case of an invalid param. > -- > > Key: HDFS-17056 > URL: https://issues.apache.org/jira/browse/HDFS-17056 > Project: Hadoop HDFS > Issue Type: Bug > Components: ec >Reporter: Ayush Saxena >Assignee: huangzhaobo99 >Priority: Major > Labels: newbie, pull-request-available > Fix For: 3.4.0 > > > {code:java} > bin/hdfs ec -verifyClusterSetup XOR-2-1-1024k > 9 DataNodes are required for the erasure coding policies: RS-6-3-1024k, > XOR-2-1-1024k. The number of DataNodes is only 3. {code} > verifyClusterSetup requires -policy then the name of policies, else it > defaults to all enabled policies. > In case there are additional invalid options it silently ignores them, unlike > other EC commands, which throw a Too Many Arguments exception.
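The fix direction described above is to make -verifyClusterSetup reject unexpected trailing arguments the way other EC subcommands do, instead of silently ignoring them. A self-contained sketch of that argument check (class and method names are illustrative, not the actual patch):

```java
import java.util.Arrays;
import java.util.List;

public class VerifyArgs {

    /** Parses "-verifyClusterSetup [-policy p1 p2 ...]" arguments, rejecting
     *  any trailing options not preceded by -policy instead of silently
     *  ignoring them (the behavior gap described in the issue). */
    static List<String> parsePolicies(String[] args) {
        if (args.length == 0) {
            return List.of();  // default: verify against all enabled policies
        }
        if (!"-policy".equals(args[0])) {
            throw new IllegalArgumentException("Too many arguments: " + args[0]);
        }
        if (args.length == 1) {
            throw new IllegalArgumentException("-policy requires at least one policy name");
        }
        return Arrays.asList(args).subList(1, args.length);
    }

    public static void main(String[] args) {
        System.out.println(parsePolicies(new String[] {"-policy", "XOR-2-1-1024k"}));
        try {
            parsePolicies(new String[] {"XOR-2-1-1024k"});  // not preceded by -policy
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this check, the misuse shown in the issue (`-verifyClusterSetup XOR-2-1-1024k` without `-policy`) fails loudly rather than being treated as "verify all policies".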
[jira] [Resolved] (HDFS-16904) Close webhdfs during the teardown
[ https://issues.apache.org/jira/browse/HDFS-16904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16904. - Hadoop Flags: Reviewed Resolution: Fixed > Close webhdfs during the teardown > - > > Key: HDFS-16904 > URL: https://issues.apache.org/jira/browse/HDFS-16904 > Project: Hadoop HDFS > Issue Type: Test > Components: hdfs >Affects Versions: 3.4.0, 3.3.5, 3.3.9 > Environment: Tested using the Hadoop development environment Docker > image. >Reporter: Steve Vaughan >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > The teardown for the tests shuts down the cluster, but leaves HDFS open.
[jira] [Resolved] (HDFS-17034) java.io.FileNotFoundException: File does not exist
[ https://issues.apache.org/jira/browse/HDFS-17034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17034. - Resolution: Cannot Reproduce this is some cluster issue, reach out to user ML with all details, Jira is for reporting bug not for end user questions!!! > java.io.FileNotFoundException: File does not exist > -- > > Key: HDFS-17034 > URL: https://issues.apache.org/jira/browse/HDFS-17034 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfs, dfsclient, hdfs >Affects Versions: 2.9.2 >Reporter: Jepson >Priority: Major > > *HBase2.2.2 Log:* > 2023-06-02 08:07:57,423 INFO [Close-WAL-Writer-177] util.FSHDFSUtils: > Recover lease on dfs file > /hbase/WALs/bdpprd07,16020,1685646099569/bdpprd07%2C16020%2C1685646099569.1685664417370 > 2023-06-02 08:07:57,425 INFO [Close-WAL-Writer-177] util.FSHDFSUtils: Failed > to recover lease, attempt=0 on > file=/hbase/WALs/bdpprd07,16020,1685646099569/bdpprd07%2C16020%2C1685646099569.1685664417370 > after 2ms > 2023-06-02 08:08:01,427 WARN [Close-WAL-Writer-177] wal.AsyncFSWAL: close > old writer failed > java.io.FileNotFoundException: File does not exist: > /hbase/WALs/bdpprd07,16020,1685646099569/bdpprd07%2C16020%2C1685646099569.1685664417370 > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:62) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLease(FSNamesystem.java:2358) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.recoverLease(NameNodeRpcServer.java:790) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.recoverLease(ClientNamenodeProtocolServerSideTranslatorPB.java:693) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:503) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:871) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:817) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2606) > at sun.reflect.GeneratedConstructorAccessor33.newInstance(Unknown > Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) > at org.apache.hadoop.hdfs.DFSClient.recoverLease(DFSClient.java:867) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:301) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.recoverLease(DistributedFileSystem.java:301) > at > org.apache.hadoop.hbase.util.FSHDFSUtils.recoverLease(FSHDFSUtils.java:283) > at > org.apache.hadoop.hbase.util.FSHDFSUtils.recoverDFSFileLease(FSHDFSUtils.java:216) > at > org.apache.hadoop.hbase.util.FSHDFSUtils.recoverFileLease(FSHDFSUtils.java:163) > at > org.apache.hadoop.hbase.io.asyncfs.FanOutOneBlockAsyncDFSOutput.recoverAndClose(FanOutOneBlockAsyncDFSOutput.java:559) > at > org.apache.hadoop.hbase.regionserver.wal.AsyncProtobufLogWriter.close(AsyncProtobufLogWriter.java:157) > at > 
org.apache.hadoop.hbase.regionserver.wal.AsyncFSWAL.lambda$closeWriter$6(AsyncFSWAL.java:646) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File > does not exist: > /hbase/WALs/bdpprd07,16020,1685646099569/bdpprd07%2C16020%2C1685646099569.1685664417370 > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:72) > at >
[jira] [Resolved] (HDFS-17238) Setting the value of "dfs.blocksize" too large will cause HDFS to be unable to write to files
[ https://issues.apache.org/jira/browse/HDFS-17238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17238. - Resolution: Won't Fix this is a misconfiguration, can't help it nor we can handle all such issues > Setting the value of "dfs.blocksize" too large will cause HDFS to be unable > to write to files > - > > Key: HDFS-17238 > URL: https://issues.apache.org/jira/browse/HDFS-17238 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.3.6 >Reporter: ECFuzz >Priority: Major > > My hadoop version is 3.3.6, and I use the Pseudo-Distributed Operation. > core-site.xml like below. > {code:java} > > > fs.defaultFS > hdfs://localhost:9000 > > > hadoop.tmp.dir > /home/hadoop/Mutil_Component/tmp > > > {code} > hdfs-site.xml like below. > {code:java} > > > dfs.replication > 1 > > > dfs.blocksize > 134217728 > > > {code} > And then format the namenode, and start the hdfs. HDFS is running normally. > {code:java} > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hdfs namenode -format > x(many info) > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > sbin/start-dfs.sh > Starting namenodes on [localhost] > Starting datanodes > Starting secondary namenodes [hadoop-Standard-PC-i440FX-PIIX-1996] {code} > Finally, use dfs to place a file. > {code:java} > bin/hdfs dfs -mkdir -p /user/hadoop > bin/hdfs dfs -mkdir input > bin/hdfs dfs -put etc/hadoop/*.xml input {code} > Discovering Exception Throwing. > {code:java} > 2023-10-19 14:56:34,603 WARN hdfs.DataStreamer: DataStreamer Exception > org.apache.hadoop.ipc.RemoteException(java.io.IOException): File > /user/hadoop/input/capacity-scheduler.xml._COPYING_ could only be written to > 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 > node(s) are excluded in this operation. 
> at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2350) > at > org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2989) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:912) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:595) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1094) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1017) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3048) > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1567) > at org.apache.hadoop.ipc.Client.call(Client.java:1513) > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:258) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:139) > at com.sun.proxy.$Proxy9.addBlock(Unknown Source) > at > 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:531) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:433) > at >
[jira] [Resolved] (HDFS-17240) Fix a typo in DataStorage.java
[ https://issues.apache.org/jira/browse/HDFS-17240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17240. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix a typo in DataStorage.java > -- > > Key: HDFS-17240 > URL: https://issues.apache.org/jira/browse/HDFS-17240 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Yu Wang >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > > Fix a typo in DataStorage.java > > {code:java} > /** > - * Analize which and whether a transition of the fs state is required > + * Analyze which and whether a transition of the fs state is required > * and perform it if necessary. > * {code} >
[jira] [Resolved] (HDFS-17282) Reconfig 'SlowIoWarningThreshold' parameters for datanode.
[ https://issues.apache.org/jira/browse/HDFS-17282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17282. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Reconfig 'SlowIoWarningThreshold' parameters for datanode. > -- > > Key: HDFS-17282 > URL: https://issues.apache.org/jira/browse/HDFS-17282 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: huangzhaobo99 >Assignee: huangzhaobo99 >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > >
[jira] [Resolved] (HDFS-17278) Detect order dependent flakiness in TestViewfsWithNfs3.java under hadoop-hdfs-nfs module
[ https://issues.apache.org/jira/browse/HDFS-17278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17278. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Detect order dependent flakiness in TestViewfsWithNfs3.java under > hadoop-hdfs-nfs module > > > Key: HDFS-17278 > URL: https://issues.apache.org/jira/browse/HDFS-17278 > Project: Hadoop HDFS > Issue Type: Bug > Environment: openjdk version "17.0.9" > Apache Maven 3.9.5 >Reporter: Ruby >Assignee: Ruby >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: failed-1.png, failed-2.png, success.png > > > The order dependent flakiness was detected if the test class > TestDFSClientCache.java runs before TestRpcProgramNfs3.java. > The error message looks like below: > {code:java} > [ERROR] Failures: > [ERROR] TestRpcProgramNfs3.testAccess:279 Incorrect return code > expected:<0> but was:<13> > [ERROR] TestRpcProgramNfs3.testCommit:764 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testCreate:493 Incorrect return code: > expected:<13> but was:<5> > [ERROR] > TestRpcProgramNfs3.testEncryptedReadWrite:359->createFileUsingNfs:393 > Incorrect response: expected: but > was: > [ERROR] TestRpcProgramNfs3.testFsinfo:714 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testFsstat:696 Incorrect return code: > expected:<0> but was:<13> > [ERROR] TestRpcProgramNfs3.testGetattr:205 Incorrect return code > expected:<0> but was:<13> > [ERROR] TestRpcProgramNfs3.testLookup:249 Incorrect return code > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testMkdir:517 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testPathconf:738 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testRead:341 Incorrect return code: expected:<0> > but was:<13> > [ERROR] TestRpcProgramNfs3.testReaddir:642 Incorrect return code: > 
expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testReaddirplus:666 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testReadlink:297 Incorrect return code: > expected:<0> but was:<5> > [ERROR] TestRpcProgramNfs3.testRemove:570 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testRename:618 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testRmdir:594 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testSetattr:225 Incorrect return code > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testSymlink:546 Incorrect return code: > expected:<13> but was:<5> > [ERROR] TestRpcProgramNfs3.testWrite:468 Incorrect return code: > expected:<13> but was:<5> > [INFO] > [ERROR] Tests run: 25, Failures: 20, Errors: 0, Skipped: 0 > [INFO] > [ERROR] There are test failures. {code} > The polluter that led to this flakiness was the test method > testGetUserGroupInformationSecure() in TestDFSClientCache.java. There was a > line > {code:java} > UserGroupInformation.setLoginUser(currentUserUgi);{code} > which modifies some shared state and resource, something like pre-setup the > config. To fix this issue, I added the cleanup methods in > TestDFSClientCache.java to reset the UserGroupInformation to ensure the > isolation among each test class. > {code:java} > @AfterClass > public static void cleanup() { > UserGroupInformation.reset(); > }{code} > Including setting > {code:java} > authenticationMethod = null; > conf = null; // set configuration to null > setLoginUser(null); // reset login user to default null{code} > ..., and so on. The reset() methods can be referred to > hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/UserGroupInformation.java. 
> After the fix, the error no longer existed and the success message was: > {code:java} > [INFO] --- > [INFO] T E S T S > [INFO] --- > [INFO] Running org.apache.hadoop.hdfs.nfs.nfs3.CustomTest > [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: > 18.457 s - in org.apache.hadoop.hdfs.nfs.nfs3.CustomTest > [INFO] > [INFO] Results: > [INFO] > [INFO] Tests run: 25, Failures: 0, Errors: 0, Skipped: 0 > [INFO] > [INFO] > > [INFO] BUILD SUCCESS > [INFO] > > {code} > Here is the CustomTest.java file that I used to run these two tests in order, > the
[jira] [Resolved] (HDFS-17279) RBF: Fix link to Fedbalance document
[ https://issues.apache.org/jira/browse/HDFS-17279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17279. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Fix link to Fedbalance document > - > > Key: HDFS-17279 > URL: https://issues.apache.org/jira/browse/HDFS-17279 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: screenshot-1.png > > > !screenshot-1.png! > Fix the link to the Fedbalance document, which cannot be displayed.
[jira] [Resolved] (HDFS-17260) Fix the logic for reconfigure slow peer enable for Namenode.
[ https://issues.apache.org/jira/browse/HDFS-17260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17260. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix the logic for reconfigure slow peer enable for Namenode. > > > Key: HDFS-17260 > URL: https://issues.apache.org/jira/browse/HDFS-17260 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: huangzhaobo99 >Assignee: huangzhaobo99 >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > >
[jira] [Resolved] (HDFS-17233) The conf dfs.datanode.lifeline.interval.seconds is not considering time unit seconds
[ https://issues.apache.org/jira/browse/HDFS-17233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17233. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > The conf dfs.datanode.lifeline.interval.seconds is not considering time unit > seconds > > > Key: HDFS-17233 > URL: https://issues.apache.org/jira/browse/HDFS-17233 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Hemanth Boyina >Assignee: Palakur Eshwitha Sai >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > {code:java} > long confLifelineIntervalMs = > getConf().getLong(DFS_DATANODE_LIFELINE_INTERVAL_SECONDS_KEY, > 3 * getConf().getTimeDuration(DFS_HEARTBEAT_INTERVAL_KEY, > DFS_HEARTBEAT_INTERVAL_DEFAULT, TimeUnit.SECONDS, > TimeUnit.MILLISECONDS)); {code} > if we configure DFS_DATANODE_LIFELINE_INTERVAL_SECONDS_KEY, the value is not > converted to milliseconds.
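The snippet above only converts the heartbeat-interval fallback to milliseconds; a value configured under the seconds-denominated key is returned by getLong as-is and then treated as milliseconds. A self-contained sketch of the corrected conversion, with Hadoop's Configuration stood in for by a plain map (names illustrative, not the actual patch):

```java
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class LifelineInterval {

    /** Mimics reading dfs.datanode.lifeline.interval.seconds: the configured
     *  value is denominated in SECONDS and must be converted to milliseconds,
     *  which the buggy getLong() path skipped. */
    static long lifelineIntervalMs(Map<String, String> conf, long heartbeatMs) {
        String v = conf.get("dfs.datanode.lifeline.interval.seconds");
        if (v == null) {
            return 3 * heartbeatMs;  // default: 3x the heartbeat interval
        }
        // The explicit seconds-to-millis conversion is the fix.
        return TimeUnit.SECONDS.toMillis(Long.parseLong(v));
    }

    public static void main(String[] args) {
        Map<String, String> conf =
            Map.of("dfs.datanode.lifeline.interval.seconds", "9");
        System.out.println(lifelineIntervalMs(conf, 3000L));     // configured: 9 s
        System.out.println(lifelineIntervalMs(Map.of(), 3000L)); // default: 3 x 3 s
    }
}
```

Without the conversion, a configured value of 9 would be read as 9 ms rather than 9000 ms, making the lifeline fire roughly a thousand times too often.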
[jira] [Resolved] (HDFS-17271) Web UI DN report shows random order when sorting with dead DNs
[ https://issues.apache.org/jira/browse/HDFS-17271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17271. - Resolution: Fixed > Web UI DN report shows random order when sorting with dead DNs > -- > > Key: HDFS-17271 > URL: https://issues.apache.org/jira/browse/HDFS-17271 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode, rbf, ui >Affects Versions: 3.4.0 >Reporter: Felix N >Assignee: Felix N >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-12-01-15-04-11-047.png > > > When sorted by "last contact" in ascending order, dead nodes come up on top > in a random order > !image-2023-12-01-15-04-11-047.png|width=337,height=263!
[jira] [Resolved] (HDFS-17261) RBF: Fix getFileInfo return wrong path when get mountTable path which multi-level
[ https://issues.apache.org/jira/browse/HDFS-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17261. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Fix getFileInfo return wrong path when get mountTable path which > multi-level > - > > Key: HDFS-17261 > URL: https://issues.apache.org/jira/browse/HDFS-17261 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: liuguanghua >Assignee: liuguanghua >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > With DFSRouter, Suppose there are two nameservices : ns0,ns1 > # Add mountTable /testgetfileinfo/ns1/dir -> (ns1 -> > /testgetfileinfo/ns1/dir) > # hdfs client via DFSRouter accesses a directory: hdfs dfs -ls -d > /testgetfileinfo > # it will return wrong path : /testgetfileinfo/testgetfileinfo >
[jira] [Resolved] (HDFS-17259) Fix typo in TestFsDatasetImpl Class.
[ https://issues.apache.org/jira/browse/HDFS-17259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17259. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix typo in TestFsDatasetImpl Class. > > > Key: HDFS-17259 > URL: https://issues.apache.org/jira/browse/HDFS-17259 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: huangzhaobo99 >Assignee: huangzhaobo99 >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > >
[jira] [Resolved] (HDFS-17235) Fix javadoc errors in BlockManager
[ https://issues.apache.org/jira/browse/HDFS-17235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17235. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix javadoc errors in BlockManager > -- > > Key: HDFS-17235 > URL: https://issues.apache.org/jira/browse/HDFS-17235 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > There are 2 errors in BlockManager.java > https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6194/4/artifact/out/patch-javadoc-hadoop-hdfs-project_hadoop-hdfs-jdkUbuntu-11.0.20.1+1-post-Ubuntu-0ubuntu120.04.txt > {code:java} > [ERROR] > /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-6194/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:153: > error: reference not found > [ERROR] * by {@link DFS_NAMENODE_MAINTENANCE_REPLICATION_MIN_KEY}. This > number has to = > [ERROR] ^ > [ERROR] > /home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-6194/ubuntu-focal/src/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java:154: > error: reference not found > [ERROR] * {@link DFS_NAMENODE_REPLICATION_MIN_KEY}. > {code}
[jira] [Resolved] (HDFS-17228) Improve documentation related to BlockManager
[ https://issues.apache.org/jira/browse/HDFS-17228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17228. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Improve documentation related to BlockManager > - > > Key: HDFS-17228 > URL: https://issues.apache.org/jira/browse/HDFS-17228 > Project: Hadoop HDFS > Issue Type: Improvement > Components: block placement, documentation >Affects Versions: 3.3.3, 3.3.6 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-10-17-17-25-27-363.png > > > In the BlockManager file, some important comments are missing. > Happens here: > !image-2023-10-17-17-25-27-363.png! > If it is improved, the robustness of the distributed system can be increased. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-17225) Fix TestNameNodeMXBean#testDecommissioningNodes
Ayush Saxena created HDFS-17225: --- Summary: Fix TestNameNodeMXBean#testDecommissioningNodes Key: HDFS-17225 URL: https://issues.apache.org/jira/browse/HDFS-17225 Project: Hadoop HDFS Issue Type: Improvement Reporter: Ayush Saxena Fails with an assertion error: {noformat} org.junit.ComparisonFailure: expected:<...commissionDuration":[2]}}> but was:<...commissionDuration":[1]}}> at org.junit.Assert.assertEquals(Assert.java:117) at org.junit.Assert.assertEquals(Assert.java:146) at org.apache.hadoop.hdfs.server.namenode.TestNameNodeMXBean.testDecommissioningNodes(TestNameNodeMXBean.java:432){noformat} [https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6185/1/testReport/org.apache.hadoop.hdfs.server.namenode/TestNameNodeMXBean/testDecommissioningNodes/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17200) Add some datanode related metrics to Metrics.md
[ https://issues.apache.org/jira/browse/HDFS-17200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17200. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add some datanode related metrics to Metrics.md > --- > > Key: HDFS-17200 > URL: https://issues.apache.org/jira/browse/HDFS-17200 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: huangzhaobo99 >Assignee: huangzhaobo99 >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17205) HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable
[ https://issues.apache.org/jira/browse/HDFS-17205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17205. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable > --- > > Key: HDFS-17205 > URL: https://issues.apache.org/jira/browse/HDFS-17205 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Currently, when allocating a new block, the NameNode chooses a datanode and then a good storage of the given storage type on that datanode; the relevant code is DatanodeDescriptor#chooseStorage4Block, which calculates the space required for the write: requiredSize = blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE (default is 1). > {code:java} > public DatanodeStorageInfo chooseStorage4Block(StorageType t, > long blockSize) { > final long requiredSize = > blockSize * HdfsServerConstants.MIN_BLOCKS_FOR_WRITE; > final long scheduledSize = blockSize * getBlocksScheduled(t); > long remaining = 0; > DatanodeStorageInfo storage = null; > for (DatanodeStorageInfo s : getStorageInfos()) { > if (s.getState() == State.NORMAL && s.getStorageType() == t) { > if (storage == null) { > storage = s; > } > long r = s.getRemaining(); > if (r >= requiredSize) { > remaining += r; > } > } > } > if (requiredSize > remaining - scheduledSize) { > BlockPlacementPolicy.LOG.debug( > "The node {} does not have enough {} space (required={}," > + " scheduled={}, remaining={}).", > this, t, requiredSize, scheduledSize, remaining); > return null; > } > return storage; > } > {code} > But multiple namespaces may select a storage on the same datanode to write blocks at the same time.
> In extreme cases, if only one block's worth of space is left in the current storage, there may not be enough free space for the writer to write its data, and a log similar to the following appears: > {code:java} > The volume [file:/disk1/] with the available space (=21129618 B) is less than the block size (=268435456 B). > {code} > To avoid this case, HdfsServerConstants.MIN_BLOCKS_FOR_WRITE should be configurable so the parameter can be adjusted in larger clusters. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
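The idea behind HDFS-17205 above can be sketched as follows. This is a minimal, self-contained illustration using a plain string map in place of Hadoop's Configuration; the config key name is hypothetical and may differ from the one the patch actually introduced.

```java
import java.util.Map;

public class MinBlocksForWriteSketch {
    // Hypothetical config key and its default; the key name chosen by
    // HDFS-17205 may differ. Default 1 matches the old hard-coded constant.
    static final String KEY = "dfs.namenode.min.blocks.for.write";
    static final int DEFAULT = 1;

    // Reads the multiplier from configuration instead of the hard-coded
    // HdfsServerConstants.MIN_BLOCKS_FOR_WRITE, so larger clusters can
    // require more headroom per chosen storage.
    static long requiredSize(Map<String, String> conf, long blockSize) {
        int minBlocksForWrite =
            Integer.parseInt(conf.getOrDefault(KEY, String.valueOf(DEFAULT)));
        return blockSize * minBlocksForWrite;
    }
}
```

With the default, behavior is unchanged; setting the key to, say, 4 makes chooseStorage4Block demand four blocks' worth of free space.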
[jira] [Resolved] (HDFS-17209) Correct comments to align with the code
[ https://issues.apache.org/jira/browse/HDFS-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17209. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Correct comments to align with the code > --- > > Key: HDFS-17209 > URL: https://issues.apache.org/jira/browse/HDFS-17209 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.3.6 >Reporter: Yu Wang >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > > The waiting time in the comments is inconsistent with the code in > DataNode.java > {code:java} > public void shutdown() { > ... > while (true) { > // When shutting down for restart, wait 2.5 seconds before forcing > // termination of receiver threads. > if (!this.shutdownForUpgrade || > (this.shutdownForUpgrade && (Time.monotonicNow() - timeNotified > > 1000))) { > this.threadGroup.interrupt(); > break; > }{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
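One way to keep a comment like the one in HDFS-17209 above from drifting is to name the constant so the comment and the code share a single source of truth. A small sketch (names are illustrative, not the actual DataNode fields); the 1000 ms value mirrors the code quoted above, not the stale "2.5 seconds" comment:

```java
public class ShutdownWaitSketch {
    // Wait this long before forcing termination of receiver threads when
    // shutting down for restart (matches the 1000 ms in the code, which the
    // old comment misstated as 2.5 seconds).
    static final long RECEIVER_INTERRUPT_WAIT_MS = 1000;

    // Same condition as the quoted loop body, simplified:
    // !u || (u && elapsed > T)  is equivalent to  !u || elapsed > T.
    static boolean shouldInterrupt(boolean shutdownForUpgrade, long elapsedMs) {
        return !shutdownForUpgrade || elapsedMs > RECEIVER_INTERRUPT_WAIT_MS;
    }
}
```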
[jira] [Resolved] (HDFS-17133) TestFsDatasetImpl missing null check when cleaning up
[ https://issues.apache.org/jira/browse/HDFS-17133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17133. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > TestFsDatasetImpl missing null check when cleaning up > - > > Key: HDFS-17133 > URL: https://issues.apache.org/jira/browse/HDFS-17133 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ConfX >Assignee: ConfX >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: reproduce.sh > > > h2. What happened > I set {{dfs.namenode.quota.init-threads=1468568631}} and the test {{org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl#testMoveBlockFailure}} fails with a NullPointerException. > h2. Where's the problem > In the cleanup part of the test: > {noformat} > } finally { > if (cluster.isClusterUp()) { > cluster.shutdown(); > } > }{noformat} > if cluster is null, the test fails directly with a null pointer exception, potentially hiding the actual failure. > h2. How to reproduce > # set {{dfs.namenode.quota.init-threads=1468568631}} > # run {{org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl#testMoveBlockFailure}} > you should observe: > {noformat} > java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testMoveBlockFailure(TestFsDatasetImpl.java:1005){noformat} > For easy reproduction, run the reproduce.sh in the attachment. > We are happy to provide a patch if this issue is confirmed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
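The null-check fix described in HDFS-17133 above amounts to guarding the finally block, so a failed cluster construction no longer masks the original test failure with an NPE. A minimal sketch with a stand-in interface for MiniDFSCluster:

```java
public class SafeCleanupSketch {
    // Stand-in for the MiniDFSCluster methods the cleanup uses.
    interface Cluster {
        boolean isClusterUp();
        void shutdown();
    }

    // Null-guarded cleanup: if cluster was never assigned (construction
    // threw), this is a no-op instead of a NullPointerException.
    static void shutdownQuietly(Cluster cluster) {
        if (cluster != null && cluster.isClusterUp()) {
            cluster.shutdown();
        }
    }
}
```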
[jira] [Resolved] (HDFS-17211) Fix comments in the RemoteParam class
[ https://issues.apache.org/jira/browse/HDFS-17211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17211. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix comments in the RemoteParam class > - > > Key: HDFS-17211 > URL: https://issues.apache.org/jira/browse/HDFS-17211 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.3.4, 3.3.6 >Reporter: hellosrc >Assignee: hellosrc >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-09-27-16-18-22-421.png > > > RemoteParam's constructor comment says RemoveLocationContext, but there is no RemoveLocationContext class in the Hadoop project; it is evidently meant to be RemoteLocationContext, so the comment should be changed to RemoteLocationContext. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17115) HttpFS Add Support getErasureCodeCodecs API
[ https://issues.apache.org/jira/browse/HDFS-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17115. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > HttpFS Add Support getErasureCodeCodecs API > --- > > Key: HDFS-17115 > URL: https://issues.apache.org/jira/browse/HDFS-17115 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > We should ensure that *WebHDFS* remains synchronized with {*}HttpFS{*}, as > the former has already implemented the *getErasureCodeCodecs* interface. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17111) RBF: Optimize msync to only call nameservices that have observer reads enabled.
[ https://issues.apache.org/jira/browse/HDFS-17111?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17111. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Optimize msync to only call nameservices that have observer reads > enabled. > --- > > Key: HDFS-17111 > URL: https://issues.apache.org/jira/browse/HDFS-17111 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Simbarashe Dzinamarira >Assignee: Simbarashe Dzinamarira >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Right now when a client MSYNCs to the router, the call is fanned out to all > nameservices. We only need to proxy the msync to nameservices that have > observer reads configured. > We can do this either by adding a new config for the admin to specify which > nameservices have CRS configured, or we can try to automatically detect these. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17119) RBF: Logger fix for StateStoreMySQLImpl
[ https://issues.apache.org/jira/browse/HDFS-17119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17119. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Logger fix for StateStoreMySQLImpl > --- > > Key: HDFS-17119 > URL: https://issues.apache.org/jira/browse/HDFS-17119 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Zhaohui Wang >Assignee: Zhaohui Wang >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17074) Remove incorrect comment in TestRedudantBlocks#setup
[ https://issues.apache.org/jira/browse/HDFS-17074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17074. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove incorrect comment in TestRedudantBlocks#setup > > > Key: HDFS-17074 > URL: https://issues.apache.org/jira/browse/HDFS-17074 > Project: Hadoop HDFS > Issue Type: Improvement > Components: test >Affects Versions: 3.4.0 >Reporter: farmmamba >Assignee: farmmamba >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > > In TestRedudantBlocks#setup(), The below comment is incorrect. > {code:java} > // disable block recovery > conf.setInt(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_INTERVAL_SECONDS_KEY, 1); > conf.setInt(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1);{code} > We should delete this comment. > The correct usage is in TestAddOverReplicatedStripedBlocks#setup() > {code:java} > // disable block recovery > conf.setInt(DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY, 0); > conf.setInt(DFSConfigKeys.DFS_NAMENODE_REDUNDANCY_INTERVAL_SECONDS_KEY, 1); > conf.setInt(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1); {code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17075) Reconfig disk balancer parameters for datanode
[ https://issues.apache.org/jira/browse/HDFS-17075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17075. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Reconfig disk balancer parameters for datanode > -- > > Key: HDFS-17075 > URL: https://issues.apache.org/jira/browse/HDFS-17075 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Rolling restart of datanodes takes a long time; making the disk balancer parameters dfs.disk.balancer.enabled and dfs.disk.balancer.plan.valid.interval reconfigurable at runtime in the datanode facilitates cluster operation and maintenance. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17086) Fix the parameter settings in TestDiskspaceQuotaUpdate#updateCountForQuota.
[ https://issues.apache.org/jira/browse/HDFS-17086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17086. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix the parameter settings in TestDiskspaceQuotaUpdate#updateCountForQuota. > --- > > Key: HDFS-17086 > URL: https://issues.apache.org/jira/browse/HDFS-17086 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17069) The documentation and implementation of "dfs.blocksize" are inconsistent.
[ https://issues.apache.org/jira/browse/HDFS-17069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17069. - Resolution: Not A Problem > The documentation and implementation of "dfs.blocksize" are inconsistent. > - > > Key: HDFS-17069 > URL: https://issues.apache.org/jira/browse/HDFS-17069 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfs, documentation >Affects Versions: 3.3.6 > Environment: Linux version 4.15.0-142-generic > (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu > 5.4.0-6ubuntu1~16.04.12)) > java version "1.8.0_162" > Java(TM) SE Runtime Environment (build 1.8.0_162-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode) >Reporter: ECFuzz >Priority: Major > Labels: pull-request-available > > My hadoop version is 3.3.6, and I use the Pseudo-Distributed Operation. > core-site.xml like below. > {code:java} > > > fs.defaultFS > hdfs://localhost:9000 > > > hadoop.tmp.dir > /home/hadoop/Mutil_Component/tmp > > > {code} > hdfs-site.xml like below. > {code:java} > > > dfs.replication > 1 > > > dfs.blocksize > 128k > > > {code} > And then format the namenode, and start the hdfs. > {code:java} > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hdfs namenode -format > x(many info) > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > sbin/start-dfs.sh > Starting namenodes on [localhost] > Starting datanodes > Starting secondary namenodes [hadoop-Standard-PC-i440FX-PIIX-1996]{code} > Finally, use dfs to put a file. Then I get the message which means 128k is > less than 1M. 
> > {code:java} > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hdfs dfs -mkdir -p /user/hadoop > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hdfs dfs -mkdir input > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hdfs dfs -put etc/hadoop/hdfs-site.xml input > put: Specified block size is less than configured minimum value > (dfs.namenode.fs-limits.min-block-size): 131072 < 1048576 > {code} > But I find that in the documentation (hdfs-default.xml), dfs.blocksize can be set to values like 128k: > {code:java} > The default block size for new files, in bytes. You can use the following > suffix (case insensitive): k(kilo), m(mega), g(giga), t(tera), p(peta), > e(exa) to specify the size (such as 128k, 512m, 1g, etc.), Or provide > complete size in bytes (such as 134217728 for 128 MB).{code} > So, is there an issue with the documentation here? Or should users be advised to set this configuration to a value larger than 1M? > > Additionally, I start YARN and run the given MapReduce job. > {code:java} > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > sbin/start-yarn.sh > hadoop@hadoop-Standard-PC-i440FX-PIIX-1996:~/Mutil_Component/hadoop-3.3.6$ > bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar > grep input output 'dfs[a-z.]+'{code} > Then the shell throws exceptions like the below.
> {code:java} > 2023-07-12 15:12:29,964 INFO client.DefaultNoHARMFailoverProxyProvider: > Connecting to ResourceManager at /0.0.0.0:8032 > 2023-07-12 15:12:30,430 INFO mapreduce.JobResourceUploader: Disabling Erasure > Coding for path: > /tmp/hadoop-yarn/staging/hadoop/.staging/job_1689145947338_0001 > 2023-07-12 15:12:30,542 INFO mapreduce.JobSubmitter: Cleaning up the staging > area /tmp/hadoop-yarn/staging/hadoop/.staging/job_1689145947338_0001 > org.apache.hadoop.ipc.RemoteException(java.io.IOException): Specified block > size is less than configured minimum value > (dfs.namenode.fs-limits.min-block-size): 131072 < 1048576 > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2690) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2625) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:807) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:496) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621) > at >
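For the HDFS-17069 report above, the block size itself is valid; it is just below the NameNode's minimum. One way to make the 128k experiment work, suitable for testing only, is to lower the minimum alongside it in hdfs-site.xml. The `dfs.namenode.fs-limits.min-block-size` key is the one named in the error message; the values below are illustrative.

```xml
<property>
  <name>dfs.blocksize</name>
  <value>128k</value>
</property>
<property>
  <!-- Default is 1048576 (1 MB); must be <= dfs.blocksize for the put to succeed. -->
  <name>dfs.namenode.fs-limits.min-block-size</name>
  <value>131072</value>
</property>
```

In production the sensible direction is the opposite: keep the minimum and choose a dfs.blocksize of at least 1 MB.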
[jira] [Resolved] (HDFS-17068) Datanode should record last directory scan time.
[ https://issues.apache.org/jira/browse/HDFS-17068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17068. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Datanode should record last directory scan time. > > > Key: HDFS-17068 > URL: https://issues.apache.org/jira/browse/HDFS-17068 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: farmmamba >Assignee: farmmamba >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > I think it is useful for us to record last directory scan time for one > datanode. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17081) EC: Add logic for striped blocks in isSufficientlyReplicated
[ https://issues.apache.org/jira/browse/HDFS-17081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17081. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > EC: Add logic for striped blocks in isSufficientlyReplicated > > > Key: HDFS-17081 > URL: https://issues.apache.org/jira/browse/HDFS-17081 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > When appending to an EC file, the check that a block is replicated to at least the minimum replication needs to consider striped blocks. Currently only the replica minimum replication is considered; the code is as follows: > {code:java} > /** >* Check if a block is replicated to at least the minimum replication. >*/ > public boolean isSufficientlyReplicated(BlockInfo b) { > // Compare against the lesser of the minReplication and number of live > DNs. > final int liveReplicas = countNodes(b).liveReplicas(); > if (liveReplicas >= minReplication) { > return true; > } > // getNumLiveDataNodes() is very expensive and we minimize its use by > // comparing with minReplication first. > return liveReplicas >= getDatanodeManager().getNumLiveDataNodes(); > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
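The gap HDFS-17081 above points at can be sketched as follows: for a striped block, "sufficient" should compare live internal blocks against the schema's data-unit count (e.g. 6 for RS-6-3), not the contiguous-replication minimum. This is an illustrative standalone function, not the actual BlockManager signature committed by the patch.

```java
public class StripedMinReplicationSketch {
    // For striped (EC) blocks the required count comes from the erasure
    // coding schema's number of data units; for contiguous blocks it stays
    // the configured minimum replication.
    static boolean isSufficientlyReplicated(boolean striped, int liveReplicas,
                                            int minReplication,
                                            int realDataBlockNum) {
        int required = striped ? realDataBlockNum : minReplication;
        return liveReplicas >= required;
    }
}
```

Under RS-6-3, 3 live internal blocks would wrongly pass the old replica-only check (min replication 1) but correctly fail here.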
[jira] [Resolved] (HDFS-17083) Support getErasureCodeCodecs API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-17083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17083. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Support getErasureCodeCodecs API in WebHDFS > --- > > Key: HDFS-17083 > URL: https://issues.apache.org/jira/browse/HDFS-17083 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-07-12-22-52-15-954.png > > > WebHDFS should support getErasureCodeCodecs: > !image-2023-07-12-22-52-15-954.png|width=799,height=210! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17082) Add documentation for provisionSnapshotTrash command to HDFSCommands.md and HdfsSnapshots.md
[ https://issues.apache.org/jira/browse/HDFS-17082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17082. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add documentation for provisionSnapshotTrash command to HDFSCommands.md and > HdfsSnapshots.md > - > > Key: HDFS-17082 > URL: https://issues.apache.org/jira/browse/HDFS-17082 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > HDFS-15607 and HDFS-15997 introduced provisionSnapshotTrash should add it to > the document. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17076) Remove the unused method isSlownodeByNameserviceId in DataNode
[ https://issues.apache.org/jira/browse/HDFS-17076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17076. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove the unused method isSlownodeByNameserviceId in DataNode > -- > > Key: HDFS-17076 > URL: https://issues.apache.org/jira/browse/HDFS-17076 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Remove the unused method isSlownodeByNameserviceId() in DataNode. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17064) Document the usage of the new Balancer "sortTopNodes" and "hotBlockTimeInterval" parameter
[ https://issues.apache.org/jira/browse/HDFS-17064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17064. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Document the usage of the new Balancer "sortTopNodes" and > "hotBlockTimeInterval" parameter > -- > > Key: HDFS-17064 > URL: https://issues.apache.org/jira/browse/HDFS-17064 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17070) Remove unused import in DataNodeMetricHelper.java.
[ https://issues.apache.org/jira/browse/HDFS-17070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17070. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove unused import in DataNodeMetricHelper.java. > -- > > Key: HDFS-17070 > URL: https://issues.apache.org/jira/browse/HDFS-17070 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: farmmamba >Assignee: farmmamba >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > > Remove unused import in DataNodeMetricHelper.java. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-17056) EC: Fix verifyClusterSetup output in case of an invalid param
Ayush Saxena created HDFS-17056: --- Summary: EC: Fix verifyClusterSetup output in case of an invalid param Key: HDFS-17056 URL: https://issues.apache.org/jira/browse/HDFS-17056 Project: Hadoop HDFS Issue Type: Bug Components: ec Reporter: Ayush Saxena {code:java} bin/hdfs ec -verifyClusterSetup XOR-2-1-1024k 9 DataNodes are required for the erasure coding policies: RS-6-3-1024k, XOR-2-1-1024k. The number of DataNodes is only 3. {code} verifyClusterSetup accepts -policy followed by policy names; otherwise it defaults to all enabled policies. If there are additional invalid options it silently ignores them, unlike other EC commands, which throw a Too Many Arguments exception. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
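The fix HDFS-17056 asks for is ordinary argument validation: reject trailing arguments that are not introduced by -policy instead of silently ignoring them. A hedged sketch; the method name and exception choice are illustrative, not the actual ECAdmin code.

```java
import java.util.Arrays;
import java.util.List;

public class VerifyClusterSetupArgsSketch {
    // verifyClusterSetup takes either no arguments or "-policy <name>...".
    // Anything else (e.g. a bare policy name without -policy) should fail
    // loudly, matching the "Too many arguments" behavior of other EC
    // subcommands.
    static void validate(String... args) {
        List<String> a = Arrays.asList(args);
        if (!a.isEmpty() && !a.get(0).equals("-policy")) {
            throw new IllegalArgumentException("Too many arguments: " + a);
        }
    }
}
```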
[jira] [Resolved] (HDFS-17053) Optimize method BlockInfoStriped#findSlot to reduce time complexity.
[ https://issues.apache.org/jira/browse/HDFS-17053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17053. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Optimize method BlockInfoStriped#findSlot to reduce time complexity. > > > Key: HDFS-17053 > URL: https://issues.apache.org/jira/browse/HDFS-17053 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: farmmamba >Assignee: farmmamba >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > > Currently, method findSlot contains the code snippet below: > {code:java} > for (; i < getCapacity(); i++) { > if (getStorageInfo(i) == null) { > return i; > } > } {code} > It computes (triplets.length / 3) on every iteration; this can be optimized. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
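The HDFS-17053 optimization above is classic loop hoisting: cache the capacity once instead of re-deriving triplets.length / 3 on every iteration. A simplified standalone version (the real findSlot lives in BlockInfoStriped and grows the triplets array instead of returning -1):

```java
public class FindSlotSketch {
    // triplets stores 3 entries per slot; the storage reference for slot i
    // sits at index 3 * i (standing in for getStorageInfo(i)).
    static int findSlot(Object[] triplets) {
        final int capacity = triplets.length / 3; // hoisted out of the loop
        for (int i = 0; i < capacity; i++) {
            if (triplets[3 * i] == null) {
                return i; // first free slot
            }
        }
        return -1; // illustrative; the real method expands the array
    }
}
```

Behavior is unchanged; only the per-iteration division disappears.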
[jira] [Resolved] (HDFS-17043) HttpFS implementation for getAllErasureCodingPolicies
[ https://issues.apache.org/jira/browse/HDFS-17043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17043. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > HttpFS implementation for getAllErasureCodingPolicies > - > > Key: HDFS-17043 > URL: https://issues.apache.org/jira/browse/HDFS-17043 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > HttpFS should support the getAllErasureCodingPolicies API in order to be able to retrieve all Erasure Coding Policies. The WebHDFS implementation is available in HDFS-17029. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17047) BlockManager#addStoredBlock should log storage id when AddBlockResult is REPLACED
[ https://issues.apache.org/jira/browse/HDFS-17047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17047. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > BlockManager#addStoredBlock should log storage id when AddBlockResult is > REPLACED > - > > Key: HDFS-17047 > URL: https://issues.apache.org/jira/browse/HDFS-17047 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: farmmamba >Assignee: farmmamba >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Recently, we frequently found logs like the below in the active namenode: > > {code:java} > 2023-06-12 05:34:09,821 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 05:34:09,892 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 11:34:07,932 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 11:34:08,027 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 17:34:08,742 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 17:34:08,813 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 23:34:09,752 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-12 23:34:09,812 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-13 05:34:08,065 WARN 
BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-13 05:34:08,144 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-13 11:34:08,638 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010 > 2023-06-13 11:34:08,681 WARN BlockStateChange: BLOCK* addStoredBlock: block > blk_-9223372036614126544_57136788 moved to storageType DISK on node > datanode1:50010{code} > > > All logs have the same ec block id : blk_-9223372036614126544_57136788 and > printed every 6 hours(FBR interval of our cluster). > To figure out what happened, I think we should also log storage id here. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
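The request above is easy to see in miniature: when the same REPLACED result fires on every full block report, a warning that omits the storage ID cannot tell the operator which replica location changed. Below is a hedged sketch of the improved log line; `buildWarning` and its parameters are illustrative stand-ins, not the actual `BlockManager#addStoredBlock` code.

```java
// Hypothetical helper, not the real BlockManager code: it only shows how
// appending the storage id makes repeated REPLACED warnings actionable.
class AddStoredBlockLog {

  // Builds the WARN line; the trailing storageId tells the operator which
  // storage the replica was moved onto, which the original message omits.
  static String buildWarning(String block, String node, String storageId) {
    return String.format(
        "BLOCK* addStoredBlock: block %s moved to storageType DISK on node %s, storageId %s",
        block, node, storageId);
  }

  public static void main(String[] args) {
    System.out.println(buildWarning(
        "blk_-9223372036614126544_57136788", "datanode1:50010", "DS-example"));
  }
}
```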
[jira] [Resolved] (HDFS-16946) RBF: top real owners metrics can't be parsed as a JSON string
[ https://issues.apache.org/jira/browse/HDFS-16946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16946. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: top real owners metrics can't be parsed as a JSON string > -- > > Key: HDFS-16946 > URL: https://issues.apache.org/jira/browse/HDFS-16946 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Affects Versions: 3.4.0 >Reporter: Max Xie >Assignee: Nishtha Shah >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-03-09-22-24-39-833.png > > > HDFS-15447 added top real owner metrics for delegation tokens, but the > metrics can't be parsed as a JSON string. > The RBFMetrics$getTopTokenRealOwners method just returns > `org.apache.hadoop.metrics2.util.Metrics2Util$NameValuePair@1` > !image-2023-03-09-22-24-39-833.png! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
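The `NameValuePair@1` output in the report above is the default `Object#toString`. A minimal sketch of the underlying fix, assuming a pair type shaped like `Metrics2Util$NameValuePair` (the real class is not reproduced here): convert each pair into a plain map of fields before handing it to a JSON serializer, instead of stringifying the object itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in for Metrics2Util$NameValuePair; the real class lives in
// org.apache.hadoop.metrics2.util and may differ in detail.
class TopOwnersJson {
  static final class NameValuePair {
    final String name;
    final long value;
    NameValuePair(String name, long value) { this.name = name; this.value = value; }
  }

  // Default Object#toString() yields "NameValuePair@1"-style output; a map
  // of named fields serializes to proper JSON with any JSON library.
  static List<Map<String, Object>> toSerializable(List<NameValuePair> pairs) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (NameValuePair p : pairs) {
      Map<String, Object> m = new LinkedHashMap<>();
      m.put("name", p.name);
      m.put("value", p.value);
      out.add(m);
    }
    return out;
  }
}
```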
[jira] [Resolved] (HDFS-17035) FsVolumeImpl#getActualNonDfsUsed may return negative value
[ https://issues.apache.org/jira/browse/HDFS-17035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17035. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > FsVolumeImpl#getActualNonDfsUsed may return negative value > -- > > Key: HDFS-17035 > URL: https://issues.apache.org/jira/browse/HDFS-17035 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.4.0 >Reporter: farmmamba >Assignee: farmmamba >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
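The ticket above gives no detail beyond the summary, but the failure mode is generic: non-DFS usage is derived by subtraction, and transient inconsistency between the terms can push the result below zero. A hedged sketch of the usual guard follows; the real `FsVolumeImpl` arithmetic involves reserved space and differs in detail.

```java
// Illustrative only: the actual FsVolumeImpl#getActualNonDfsUsed derives
// usage from volume counters; the clamp is the point being demonstrated.
class NonDfsUsed {
  static long actualNonDfsUsed(long capacity, long dfsUsed, long remaining) {
    // Clamp so a momentarily stale counter never surfaces as a negative metric.
    return Math.max(0L, capacity - dfsUsed - remaining);
  }
}
```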
[jira] [Resolved] (HDFS-17029) Support getECPolicies API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-17029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17029. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Support getECPolicies API in WebHDFS > --- > > Key: HDFS-17029 > URL: https://issues.apache.org/jira/browse/HDFS-17029 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-05-29-23-55-09-224.png > > > WebHDFS should support getEcPolicies: > !image-2023-05-29-23-55-09-224.png|width=817,height=234! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17028) RBF: Optimize debug logs of class ConnectionPool and other related class.
[ https://issues.apache.org/jira/browse/HDFS-17028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17028. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Optimize debug logs of class ConnectionPool and other related class. > - > > Key: HDFS-17028 > URL: https://issues.apache.org/jira/browse/HDFS-17028 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.4 >Reporter: farmmamba >Assignee: farmmamba >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > When we change the log level of RouterRpcClient from INFO to DEBUG to figure > out which connection an user is using. We found logs below: > > {code:java} > 2023-05-29 09:46:09,033 DEBUG > org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient: User someone > NN ANN:8020 is using connection > ClientNamenodeProtocolTranslatorPB@ANN/ANN_IP:8020x3 > 2023-05-29 09:46:09,037 DEBUG > org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient: User someone > NN ANN:8020 is using connection > ClientNamenodeProtocolTranslatorPB@ANN/ANN_IP:8020x1 > 2023-05-29 09:46:09,037 DEBUG > org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient: User someone > NN ANN:8020 is using connection > ClientNamenodeProtocolTranslatorPB@ANN/ANN_IP:8020x2 > 2023-05-29 09:46:09,037 DEBUG > org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient: User someone > NN ANN:8020 is using connection > ClientNamenodeProtocolTranslatorPB@ANN/ANN_IP:8020x3 > 2023-05-29 09:46:09,042 DEBUG > org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient: User someone > NN ANN:8020 is using connection > ClientNamenodeProtocolTranslatorPB@ANN/ANN_IP:8020x0 {code} > It seems not very clear for us to figure out which connection user is using. > Therefore, i think we should optimize the toString method of class > ConnectionContext. 
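The ambiguity described above comes from `ConnectionContext`'s terse `toString`, which renders as `...@ANN/ANN_IP:8020x3`. A hedged sketch of a more descriptive rendering; the field names are stand-ins, since the actual class keeps its own counters and state flags.

```java
// Hypothetical formatter for a connection wrapper; not the real
// ConnectionContext, whose fields and counters differ.
class ConnectionContextToString {
  static String describe(String user, String namenode, int numThreads,
                         boolean closed, boolean active) {
    // One self-describing token per field beats a bare "x3" suffix whose
    // meaning the operator has to guess.
    return String.format("[%s->%s, %d threads, closed=%b, active=%b]",
        user, namenode, numThreads, closed, active);
  }
}
```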
[jira] [Created] (HDFS-17038) TestDirectoryScanner.testThrottle() is still a little flaky
Ayush Saxena created HDFS-17038: --- Summary: TestDirectoryScanner.testThrottle() is still a little flaky Key: HDFS-17038 URL: https://issues.apache.org/jira/browse/HDFS-17038 Project: Hadoop HDFS Issue Type: Bug Reporter: Ayush Saxena Failing every now and then {noformat} java.lang.AssertionError: Throttle is too permissive at org.junit.Assert.fail(Assert.java:89) at org.junit.Assert.assertTrue(Assert.java:42) at org.apache.hadoop.hdfs.server.datanode.TestDirectoryScanner.testThrottling(TestDirectoryScanner.java:789) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62){noformat} [https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1247/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDirectoryScanner/testThrottling/] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17031) RBF: Reduce repeated code in RouterRpcServer
[ https://issues.apache.org/jira/browse/HDFS-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17031. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Reduce repeated code in RouterRpcServer > > > Key: HDFS-17031 > URL: https://issues.apache.org/jira/browse/HDFS-17031 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Reporter: Chengwei Wang >Assignee: Chengwei Wang >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Reduce repeated codes : > > {code:java} > if (subclusterResolver instanceof MountTableResolver) { > try { > MountTableResolver mountTable = (MountTableResolver)subclusterResolver; > MountTable entry = mountTable.getMountPoint(path); > // check logic > } catch (IOException e) { > LOG.error("Cannot get mount point", e); > } > } > return false; {code} > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
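The duplicated blocks above all follow one shape: cast-check the resolver, look up the mount point, run an issue-specific check, swallow the IOException. A hedged sketch of the extraction; `Resolver` is a stand-in for `MountTableResolver`, and the real refactor in the PR may differ.

```java
import java.io.IOException;
import java.util.function.Predicate;

// Stand-in types: Resolver mimics the getMountPoint lookup on
// MountTableResolver; the real RBF classes are not reproduced here.
class MountPointHelper {
  interface Resolver {
    String getMountPoint(String path) throws IOException;
  }

  // One helper replaces each copy of the instanceof/try/catch block; the
  // caller supplies only the issue-specific check on the resolved entry.
  static boolean checkMountPoint(Object resolver, String path,
                                 Predicate<String> check) {
    if (!(resolver instanceof Resolver)) {
      return false;
    }
    try {
      String entry = ((Resolver) resolver).getMountPoint(path);
      return entry != null && check.test(entry);
    } catch (IOException e) {
      return false; // the original block logs "Cannot get mount point" here
    }
  }
}
```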
[jira] [Resolved] (HDFS-16996) Fix flaky testFsCloseAfterClusterShutdown in TestFileCreation
[ https://issues.apache.org/jira/browse/HDFS-16996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16996. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix flaky testFsCloseAfterClusterShutdown in TestFileCreation > - > > Key: HDFS-16996 > URL: https://issues.apache.org/jira/browse/HDFS-16996 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Uma Maheswara Rao G >Assignee: Nishtha Shah >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > {code:java} > [ERROR] > testFsCloseAfterClusterShutdown(org.apache.hadoop.hdfs.TestFileCreation) Time > elapsed: 1.725 s <<< FAILURE! java.lang.AssertionError: Test resulted in an > unexpected exit: 1: Block report processor encountered fatal exception: > java.lang.ClassCastException: org.apache.hadoop.fs.FsServerDefaults cannot be > cast to java.lang.Boolean at > org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:2166) at > org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:2152) at > org.apache.hadoop.hdfs.MiniDFSCluster.shutdown(MiniDFSCluster.java:2145) at > org.apache.hadoop.hdfs.TestFileCreation.testFsCloseAfterClusterShutdown(TestFileCreation.java:1198) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at > 
org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > Caused by: 1: Block report processor encountered fatal exception: > java.lang.ClassCastException: org.apache.hadoop.fs.FsServerDefaults cannot be > cast to java.lang.Boolean at > org.apache.hadoop.util.ExitUtil.terminate(ExitUtil.java:381) at > org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.run(BlockManager.java:5451){code} > 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5532/10/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17000) Potential infinite loop in TestDFSStripedOutputStreamUpdatePipeline.testDFSStripedOutputStreamUpdatePipeline
[ https://issues.apache.org/jira/browse/HDFS-17000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17000. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Potential infinite loop in > TestDFSStripedOutputStreamUpdatePipeline.testDFSStripedOutputStreamUpdatePipeline > > > Key: HDFS-17000 > URL: https://issues.apache.org/jira/browse/HDFS-17000 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Marcono1234 >Assignee: Marcono1234 >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > The method > {{TestDFSStripedOutputStreamUpdatePipeline.testDFSStripedOutputStreamUpdatePipeline}} > contains the following line: > {code} > for (int i = 0; i < Long.MAX_VALUE; i++) { > {code} > [GitHub source > link|https://github.com/apache/hadoop/blob/4ee92efb73a90ae7f909e96de242d216ad6878b2/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSStripedOutputStreamUpdatePipeline.java#L48] > Because {{i}} is an {{int}} the condition {{i < Long.MAX_VALUE}} will always > be true and {{i}} will simply overflow. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
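The promotion rule behind the bug can be checked directly: in `i < Long.MAX_VALUE` the `int` operand is widened to `long`, so the comparison holds for every possible `int` value and the loop can only exit via a `break`, while `i` silently wraps around. The fix described in the issue is simply to bound the counter by a value an `int` can reach, or declare it `long`.

```java
// Demonstrates the bug class from the test: comparing an int against
// Long.MAX_VALUE is always true, so such a loop cannot end by its condition.
class OverflowLoop {
  static boolean conditionAlwaysTrue(int i) {
    return i < Long.MAX_VALUE; // i is promoted to long; every int qualifies
  }
}
```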
[jira] [Resolved] (HDFS-16908) Fix javadoc of field IncrementalBlockReportManager#readyToSend.
[ https://issues.apache.org/jira/browse/HDFS-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16908. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix javadoc of field IncrementalBlockReportManager#readyToSend. > --- > > Key: HDFS-16908 > URL: https://issues.apache.org/jira/browse/HDFS-16908 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode >Affects Versions: 3.3.4 >Reporter: farmmamba >Assignee: farmmamba >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > fix javadoc of field IncrementalBlockReportManager#readyToSend. > in sendImmediately(), readyToSend will be used with {{monotonicNow() - > ibrInterval >= lastIBR}} condition. So, we should update the javadoc of it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16994) NumTimedOutPendingReconstructions metrics should not be accumulated
[ https://issues.apache.org/jira/browse/HDFS-16994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16994. - Resolution: Not A Problem > NumTimedOutPendingReconstructions metrics should not be accumulated > --- > > Key: HDFS-16994 > URL: https://issues.apache.org/jira/browse/HDFS-16994 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: farmmamba >Priority: Minor > Labels: pull-request-available > > For now, the NumTimedOutPendingReconstructions metric is computed by the statement > below: > {code:java} > timedOutCount + timedOutItems.size(); {code} > Therefore, its value would always be accumulated unless namenode failover > happens. > In fact, the NumTimedOutPendingReconstructions metric should not be > accumulated. So we should set it to zero after the getNumTimedOuts > method executes. > The UT TestPendingReconstruction#testPendingReconstruction passes. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
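The "set it to zero after getNumTimedOuts" semantics proposed above are a standard read-resets counter. A hedged sketch follows; the real `PendingReconstructionBlocks` also drains a `timedOutItems` list, which is omitted here.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hedged sketch of "read resets" metric semantics; not the actual
// PendingReconstructionBlocks implementation.
class TimedOutMetric {
  private final AtomicLong timedOutCount = new AtomicLong();

  void recordTimeout() {
    timedOutCount.incrementAndGet();
  }

  // Returns the count since the last read and resets it atomically, so the
  // metric reflects the current interval instead of accumulating forever.
  long getNumTimedOuts() {
    return timedOutCount.getAndSet(0L);
  }
}
```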
[jira] [Resolved] (HDFS-17017) Fix the issue of arguments number limit in report command in DFSAdmin.
[ https://issues.apache.org/jira/browse/HDFS-17017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17017. - Fix Version/s: 3.4.0 3.3.9 Hadoop Flags: Reviewed Resolution: Fixed > Fix the issue of arguments number limit in report command in DFSAdmin. > -- > > Key: HDFS-17017 > URL: https://issues.apache.org/jira/browse/HDFS-17017 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haiyang Hu >Assignee: Haiyang Hu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > > Currently, the DFSAdmin report command should support a maximum number of > arguments of 7, such as : > hdfs dfsadmin [-report] [-live] [-dead] [-decommissioning] > [-enteringmaintenance] [-inmaintenance] [-slownodes] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17022) Fix the exception message to print the Identifier pattern
[ https://issues.apache.org/jira/browse/HDFS-17022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17022. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix the exception message to print the Identifier pattern > - > > Key: HDFS-17022 > URL: https://issues.apache.org/jira/browse/HDFS-17022 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Nishtha Shah >Assignee: Nishtha Shah >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > In case of an incorrect string passed as value, it would throw an exception, > but the message doesn't print the identifier pattern. > {code:java} > java.lang.IllegalArgumentException: [] = [[a] must be {2}{code} > instead of > {code:java} > java.lang.IllegalArgumentException: [] = [[a] must be > [a-zA-Z_][a-zA-Z0-9_\-]*{code} > Ref to original discussion: > https://github.com/apache/hadoop/pull/5669#discussion_r1198937053 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
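The fix's intent can be shown with a self-contained check that interpolates the actual pattern into the message, rather than leaving an unexpanded `{2}` placeholder. This is a hypothetical validator; the real code validates identifier names elsewhere in Hadoop and its exact message format may differ.

```java
// Hypothetical validator illustrating the fixed message: the regex itself
// appears in the exception instead of an unexpanded "{2}" placeholder.
class IdentifierCheck {
  static final String IDENTIFIER_PATTERN = "[a-zA-Z_][a-zA-Z0-9_\\-]*";

  static void checkIdentifier(String value) {
    if (!value.matches(IDENTIFIER_PATTERN)) {
      throw new IllegalArgumentException(
          String.format("[%s] must be %s", value, IDENTIFIER_PATTERN));
    }
  }
}
```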
[jira] [Resolved] (HDFS-17014) HttpFS Add Support getStatus API
[ https://issues.apache.org/jira/browse/HDFS-17014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17014. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > HttpFS Add Support getStatus API > > > Key: HDFS-17014 > URL: https://issues.apache.org/jira/browse/HDFS-17014 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > We should ensure that *WebHDFS* remains synchronized with {*}HttpFS{*}, as > the former has already implemented the *getStatus* interface. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16697) Add logs if resources are not available in NameNodeResourcePolicy
[ https://issues.apache.org/jira/browse/HDFS-16697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16697. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add logs if resources are not available in NameNodeResourcePolicy > - > > Key: HDFS-16697 > URL: https://issues.apache.org/jira/browse/HDFS-16697 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Affects Versions: 3.1.3 > Environment: Linux version 4.15.0-142-generic > (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu > 5.4.0-6ubuntu1~16.04.12)) > java version "1.8.0_162" > Java(TM) SE Runtime Environment (build 1.8.0_162-b12) > Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode) >Reporter: ECFuzz >Assignee: ECFuzz >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > {code:java} > > dfs.namenode.resource.checked.volumes.minimum > 1 > > The minimum number of redundant NameNode storage volumes required. > > {code} > I found that when setting the value of > “dfs.namenode.resource.checked.volumes.minimum” is greater than the total > number of storage volumes in the NameNode, it is always impossible to turn > off the safe mode, and when in safe mode, the file system only accepts read > data requests, but not delete, modify and other change requests, which is > greatly limited by the function. > The default value of the configuration item is 1, we set to 2 as an example > for illustration, after starting hdfs logs and the client will throw the > relevant reminders. > {code:java} > 2022-07-27 17:37:31,772 WARN > org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNode low on > available disk space. Already in safe mode. > 2022-07-27 17:37:31,772 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe > mode is ON. > Resources are low on NN. Please add or free up more resourcesthen turn off > safe mode manually. 
NOTE: If you turn off safe mode before adding resources, > the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode > leave" to turn safe mode off. > {code} > {code:java} > org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create > directory /hdfsapi/test. Name node is in safe mode. > Resources are low on NN. Please add or free up more resourcesthen turn off > safe mode manually. NOTE: If you turn off safe mode before adding resources, > the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode > leave" to turn safe mode off. NamenodeHostName:192.168.1.167 > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1468) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1455) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3174) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1145) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:714) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928) > at java.base/java.security.AccessController.doPrivileged(Native > Method) > at java.base/javax.security.auth.Subject.doAs(Subject.java:423) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916){code} > According to the prompt, it is believed that there is not enough resource > space to meet 
the corresponding conditions to close safe mode, but after > adding or releasing more resources and lowering the resource condition > threshold "dfs.namenode.resource.du.reserved", it still fails to close safe > mode and throws the same prompt . > According to the source code, we know that if the NameNode has redundant > storage volumes less than the "dfs.namenode.resource.checked.volumes.minimum" > set the minimum number of redundant storage volumes will enter safe mode. > After debugging, *we found that the current NameNode storage volumes are > abundant resource space, but because the total number
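The misconfiguration the report describes reduces to a simple invariant: if the configured minimum of redundant volumes exceeds the number of redundant volumes that exist, the resource check can never pass and the NameNode stays in safe mode forever. A hedged sketch of that check, with stand-in counts rather than the actual `NameNodeResourcePolicy` types:

```java
// Illustrative policy check, not the real NameNodeResourcePolicy code:
// it shows why an impossible minimum deserves an explicit WARN log.
class ResourcePolicySketch {
  static boolean areResourcesAvailable(int redundantAvailable,
                                       int totalRedundant,
                                       int minimumRequired) {
    if (minimumRequired > totalRedundant) {
      // The condition can never be satisfied; this is the case the issue
      // asks the NameNode to log instead of silently staying in safe mode.
      return false;
    }
    return redundantAvailable >= minimumRequired;
  }
}
```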
[jira] [Resolved] (HDFS-16653) Improve error messages in ShortCircuitCache
[ https://issues.apache.org/jira/browse/HDFS-16653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16653. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Improve error messages in ShortCircuitCache > --- > > Key: HDFS-16653 > URL: https://issues.apache.org/jira/browse/HDFS-16653 > Project: Hadoop HDFS > Issue Type: Improvement > Components: dfsadmin >Affects Versions: 3.1.3 > Environment: Linux version 4.15.0-142-generic > (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu > 5.4.0-6ubuntu1~16.04.12)) >Reporter: ECFuzz >Assignee: ECFuzz >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > > {code:java} > > dfs.client.mmap.cache.size > 256 > > When zero-copy reads are used, the DFSClient keeps a cache of recently > used > memory mapped regions. This parameter controls the maximum number of > entries that we will keep in that cache. > The larger this number is, the more file descriptors we will potentially > use for memory-mapped files. mmaped files also use virtual address space. > You may need to increase your ulimit virtual address space limits before > increasing the client mmap cache size. > > Note that you can still do zero-copy reads when this size is set to 0. > > > {code} > When the configuration item “dfs.client.mmap.cache.size” is set to a negative > number, it will cause /hadoop/bin hdfs dfsadmin -safemode provides all the > operation options including enter, leave, get, wait and forceExit are > invalid, the terminal returns security mode is null and no exceptions are > thrown. 
> In summary, I think we need to improve the check mechanism related to this > configuration item, *add maxEvictableMmapedSize that is > "dfs.client.mmap.cache.size" related Precondition check suite error > message,and give a clear indication when the configuration is abnormal in > order to solve the problem in time and reduce the impact on the safe mode > related operations.* > The details are as follows. > I think that since the constructor of the ShortCircuitCache class in > ShortCircuitCache.java in the source code already uses > Preconditions.checkArgument() to check whether the configuration item value > is greater than or equal to zero.So when set to a negative number, it will > lead to the creation of ShortCircuitCache class object in ClientContext.java > failed. > But due to Preconditions.checkArgument () in the lack of error information, > resulting in the terminal using hdfs dfsadmin script appears as follows: > {code:java} > hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave > safemode: null > Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit] > hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode enter > safemode: null > Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit] > hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode get > safemode: null > Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit] > hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode forceExit > safemode: null > Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]{code} > And hdfs logs and terminal are not related to the exception thrown. > Therefore, the cause of the situation can be found directly after adding an > error message to the original Preconditions.checkArgument(), as follows: > {code:java} > hadoop@ljq1:~/hadoop-3.1.3-work/sbin$ hdfs dfsadmin -safemode leave > safemode: Invalid argument: dfs.client.mmap.cache.size must be greater than > zero. 
> Usage: hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]{code} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
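The improved message shown above boils down to giving the argument check a text. A plain-Java sketch of the same idea follows; the real code uses Guava's `Preconditions.checkArgument` in the `ShortCircuitCache` constructor, and this version only assumes the property name quoted in the issue.

```java
// Sketch only: the real ShortCircuitCache uses Guava's
// Preconditions.checkArgument(condition, message) for this validation.
class MmapCacheConfig {
  static int checkMmapCacheSize(int size) {
    if (size < 0) {
      // A named, specific message replaces the bare "safemode: null" failure.
      throw new IllegalArgumentException(
          "Invalid argument: dfs.client.mmap.cache.size must be non-negative, got " + size);
    }
    return size;
  }
}
```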
[jira] [Resolved] (HDFS-17018) Improve dfsclient log format
[ https://issues.apache.org/jira/browse/HDFS-17018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17018. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Improve dfsclient log format > > > Key: HDFS-17018 > URL: https://issues.apache.org/jira/browse/HDFS-17018 > Project: Hadoop HDFS > Issue Type: Bug > Components: dfsclient >Affects Versions: 3.3.4 >Reporter: Xianming Lei >Assignee: Xianming Lei >Priority: Minor > Fix For: 3.4.0 > > > Modify the log format. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16979) RBF: Add dfsrouter port in hdfsauditlog
[ https://issues.apache.org/jira/browse/HDFS-16979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16979. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Add dfsrouter port in hdfsauditlog > --- > > Key: HDFS-16979 > URL: https://issues.apache.org/jira/browse/HDFS-16979 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: liuguanghua >Assignee: liuguanghua >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > > When a client connects as a proxyuser via a real user, the hdfs audit log lacks the > dfsrouter port information. > client (using proxyuser) -> dfsrouter -> namenode > clientport dfsrouterport > The hdfs audit log should record the dfsrouter port. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17012) Remove unused DFSConfigKeys#DFS_DATANODE_PMEM_CACHE_DIRS_DEFAULT
[ https://issues.apache.org/jira/browse/HDFS-17012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17012. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove unused DFSConfigKeys#DFS_DATANODE_PMEM_CACHE_DIRS_DEFAULT > > > Key: HDFS-17012 > URL: https://issues.apache.org/jira/browse/HDFS-17012 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, hdfs >Affects Versions: 3.3.4 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Fix For: 3.4.0 > > Attachments: screenshot-1.png > > > In DFSConfigKeys, DFS_DATANODE_PMEM_CACHE_DIRS_DEFAULT doesn't seem to be > used anywhere; it is a redundant option and we should remove it. > !screenshot-1.png! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-17001) Support getStatus API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-17001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-17001. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Support getStatus API in WebHDFS > > > Key: HDFS-17001 > URL: https://issues.apache.org/jira/browse/HDFS-17001 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-05-08-14-34-51-873.png > > > WebHDFS should support getStatus: > !image-2023-05-08-14-34-51-873.png! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16965) Add switch to decide whether to enable native codec.
[ https://issues.apache.org/jira/browse/HDFS-16965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16965. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add switch to decide whether to enable native codec. > > > Key: HDFS-16965 > URL: https://issues.apache.org/jira/browse/HDFS-16965 > Project: Hadoop HDFS > Issue Type: New Feature > Components: erasure-coding >Affects Versions: 3.3.4 >Reporter: WangYuanben >Assignee: WangYuanben >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Sometimes we need to create a codec without ISA-L, while priority is given to > the native codec by default. So it is necessary to add a switch to decide whether > to enable the native codec. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16978) RBF: Admin command to support bulk add of mount points
[ https://issues.apache.org/jira/browse/HDFS-16978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16978. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Admin command to support bulk add of mount points > -- > > Key: HDFS-16978 > URL: https://issues.apache.org/jira/browse/HDFS-16978 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > All state store implementations support adding multiple state store records > using a single putAll() implementation. We should provide a new router admin API > to support bulk addition of mount table entries that can utilize this bulk > add implementation at the state store level. > For more than one mount point to be added, the goals of bulk addition should be > # To reduce frequent router calls > # To avoid frequent state store cache refreshes with each single mount > point addition -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16990) HttpFS Add Support getFileLinkStatus API
[ https://issues.apache.org/jira/browse/HDFS-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16990. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > HttpFS Add Support getFileLinkStatus API > > > Key: HDFS-16990 > URL: https://issues.apache.org/jira/browse/HDFS-16990 > Project: Hadoop HDFS > Issue Type: Improvement > Components: httpfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > HttpFS should implement the *getFileLinkStatus* API already implemented in > WebHDFS. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16880) modify invokeSingleXXX interface in order to pass actual file src to namenode for debug info.
[ https://issues.apache.org/jira/browse/HDFS-16880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16880. - Resolution: Duplicate > modify invokeSingleXXX interface in order to pass actual file src to namenode > for debug info. > - > > Key: HDFS-16880 > URL: https://issues.apache.org/jira/browse/HDFS-16880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: rbf >Affects Versions: 3.3.4 >Reporter: farmmamba >Priority: Major > Labels: pull-request-available > > We found lots of INFO-level logs like the ones below: > {quote}2022-12-30 15:31:04,169 INFO org.apache.hadoop.hdfs.StateChange: DIR* > completeFile: / is closed by > DFSClient_attempt_1671783180362_213003_m_77_0_1102875551_1 > 2022-12-30 15:31:04,186 INFO org.apache.hadoop.hdfs.StateChange: DIR* > completeFile: / is closed by DFSClient_NONMAPREDUCE_1198313144_27480 > {quote} > The real path of completeFile is lost. Actually this is caused by: > > *org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient#invokeSingle(java.lang.String, > org.apache.hadoop.hdfs.server.federation.router.RemoteMethod)* > In this method, it instantiates a RemoteLocationContext object: > *RemoteLocationContext loc = new RemoteLocation(nsId, "/", "/");* > and then executes: *Object[] params = method.getParams(loc);* > The problem is right here: because we always use new RemoteParam(), > context.getDest() always returns "/". That's why we saw lots of incorrect logs. > > After diving into the invokeSingleXXX source code, I found the following RPCs, > classified into those that need the actual src and those that do not. > > *RPCs that need the src path:* > addBlock、abandonBlock、getAdditionalDatanode、complete > *RPCs that do not need the src path:* > updateBlockForPipeline、reportBadBlocks、getBlocks、updatePipeline、invokeAtAvailableNs(invoked > by: > getServerDefaults、getBlockKeys、getTransactionID、getMostRecentCheckpointTxId、versionRequest、getStoragePolicies) > > After the changes, the src can be passed to the NN correctly. 
> > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16865) RBF: The source path is always / after RBF proxied the complete, addBlock and getAdditionalDatanode RPC.
[ https://issues.apache.org/jira/browse/HDFS-16865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16865. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: The source path is always / after RBF proxied the complete, addBlock and > getAdditionalDatanode RPC. > > > Key: HDFS-16865 > URL: https://issues.apache.org/jira/browse/HDFS-16865 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > The source path is always / after RBF proxied the complete, addBlock and > getAdditionalDatanode RPC. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16707) RBF: Expose RouterRpcFairnessPolicyController related request record metrics for each nameservice to Prometheus
[ https://issues.apache.org/jira/browse/HDFS-16707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16707. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Expose RouterRpcFairnessPolicyController related request record metrics > for each nameservice to Prometheus > --- > > Key: HDFS-16707 > URL: https://issues.apache.org/jira/browse/HDFS-16707 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Jiale Qi >Assignee: Jiale Qi >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > HDFS-16302 introduced request records for each namespace, but they are only > exposed at the /jmx endpoint in JSON format, which is not very convenient. > This patch exposes these metrics at the /prom endpoint for Prometheus -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16897) Fix abundant Broken pipe exception in BlockSender
[ https://issues.apache.org/jira/browse/HDFS-16897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16897. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix abundant Broken pipe exception in BlockSender > - > > Key: HDFS-16897 > URL: https://issues.apache.org/jira/browse/HDFS-16897 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs >Affects Versions: 3.3.4 >Reporter: fanluo >Assignee: fanluo >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > In our production cluster environment, we found some exceptions in the datanode logs; it > frequently prints the error below. > In HDFS-2054 we only ignored messages starting with `Broken pipe`, which may > not be enough for the following case: > !https://user-images.githubusercontent.com/20748856/215264829-5f16dbc3-fea2-4883-a3d6-ded367564b8c.png! > This situation looks related to short-circuit read. In HDFS-4354 the > error has been wrapped, so our previous judgment conditions are invalid. > !https://user-images.githubusercontent.com/20748856/215314257-2064637b-ea46-42f5-b53f-a29e68bb50ea.png! > Maybe we can improve it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
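The HDFS-16897 report notes that once HDFS-4354 wrapped the error, a check against only the outermost exception message no longer sees `Broken pipe`. A minimal sketch of the cause-chain walk that kind of fix implies (method name hypothetical; this is not the actual Hadoop patch):

```java
import java.io.IOException;

public class BrokenPipeCheck {
    // Walk the full cause chain instead of matching only the outermost
    // message, so a "Broken pipe" wrapped inside another IOException
    // is still recognized and can be logged quietly.
    static boolean isBrokenPipe(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String m = c.getMessage();
            if (m != null && m.contains("Broken pipe")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable wrapped = new IOException("write failed",
            new IOException("Broken pipe"));
        System.out.println(isBrokenPipe(wrapped));  // wrapped case is now caught
    }
}
```

A top-level `message.startsWith("Broken pipe")` test returns false for the wrapped exception above; the chain walk returns true.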
[jira] [Resolved] (HDFS-16995) Remove unused parameters at NameNodeHttpServer#initWebHdfs
[ https://issues.apache.org/jira/browse/HDFS-16995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16995. - Fix Version/s: 3.4.0 Resolution: Fixed > Remove unused parameters at NameNodeHttpServer#initWebHdfs > -- > > Key: HDFS-16995 > URL: https://issues.apache.org/jira/browse/HDFS-16995 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Zhaohui Wang >Assignee: Zhaohui Wang >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16672) Fix lease interval comparison in BlockReportLeaseManager
[ https://issues.apache.org/jira/browse/HDFS-16672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16672. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix lease interval comparison in BlockReportLeaseManager > > > Key: HDFS-16672 > URL: https://issues.apache.org/jira/browse/HDFS-16672 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > monotonicNowMs is derived from System.nanoTime(); direct comparison is not > recommended. > > org.apache.hadoop.hdfs.server.blockmanagement.BlockReportLeaseManager#pruneIfExpired > {code:java} > if (monotonicNowMs < node.leaseTimeMs + leaseExpiryMs) { > return false; > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
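The HDFS-16672 snippet is the classic monotonic-clock pitfall the `System.nanoTime()` javadoc warns about: `now < start + interval` can overflow, while comparing the elapsed difference cannot. A simplified sketch of the two forms (an assumed reduction of the BlockReportLeaseManager logic, not the actual patch):

```java
public class LeaseExpiry {
    // Broken form: leaseTimeMs + expiryMs may overflow for monotonic-clock
    // values near Long.MAX_VALUE, making a fresh lease look expired.
    static boolean validBroken(long nowMs, long leaseTimeMs, long expiryMs) {
        return nowMs < leaseTimeMs + expiryMs;
    }

    // Safe form: compare the elapsed difference against the interval;
    // two's-complement subtraction handles wrap-around correctly.
    static boolean validSafe(long nowMs, long leaseTimeMs, long expiryMs) {
        return nowMs - leaseTimeMs < expiryMs;
    }

    public static void main(String[] args) {
        long lease = Long.MAX_VALUE - 1_000;  // clock reading near wrap-around
        long now = Long.MAX_VALUE - 500;      // only 500 ms have elapsed
        System.out.println(validSafe(now, lease, 10_000));    // true: lease still valid
        System.out.println(validBroken(now, lease, 10_000));  // false: overflow bug
    }
}
```

Away from the overflow boundary the two forms agree, which is why the bug is easy to miss in testing.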
[jira] [Resolved] (HDFS-16981) Support getFileLinkStatus API in WebHDFS
[ https://issues.apache.org/jira/browse/HDFS-16981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16981. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Support getFileLinkStatus API in WebHDFS > > > Key: HDFS-16981 > URL: https://issues.apache.org/jira/browse/HDFS-16981 > Project: Hadoop HDFS > Issue Type: Improvement > Components: webhdfs >Affects Versions: 3.4.0 >Reporter: Hualong Zhang >Assignee: Hualong Zhang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-04-13-23-41-51-380.png > > > WebHDFS should support getFileLinkStatus: > !image-2023-04-13-23-41-51-380.png|width=670,height=187! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16988) Improve NameServices info at JournalNode web UI
[ https://issues.apache.org/jira/browse/HDFS-16988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16988. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Improve NameServices info at JournalNode web UI > --- > > Key: HDFS-16988 > URL: https://issues.apache.org/jira/browse/HDFS-16988 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Zhaohui Wang >Assignee: Zhaohui Wang >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: Before.png, after.png > > > If the NameService is named xxx-abc-edg, only xxx will be displayed on the > JN web UI. > If NS1 is named xxx-abc-edg and NS2 is named xxx-lmn-xyz, both are shown as xxx on > the JN web UI. > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16954) RBF: The operation of renaming a multi-subcluster directory to a single-cluster directory should throw ioexception
[ https://issues.apache.org/jira/browse/HDFS-16954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16954. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: The operation of renaming a multi-subcluster directory to a > single-cluster directory should throw ioexception > -- > > Key: HDFS-16954 > URL: https://issues.apache.org/jira/browse/HDFS-16954 > Project: Hadoop HDFS > Issue Type: Bug > Components: rbf >Affects Versions: 3.4.0 >Reporter: Max Xie >Assignee: Max Xie >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > > Renaming a multi-subcluster directory to a single-cluster > directory may cause inconsistent behavior of the file system. This operation > should throw an exception to be reasonable. > Examples are as follows: > 1. add hash_all mount point `hdfs dfsrouteradmin -add /tmp/foo > subcluster1,subcluster2 /tmp/foo -order HASH_ALL` > 2. add mount point `hdfs dfsrouteradmin -add /user/foo subcluster1 > /user/foo` > 3. mkdir the dir in all subclusters. ` hdfs dfs -mkdir /tmp/foo/123 ` > 4. check the dir; all subclusters will have the dir `/tmp/foo/123` > `hdfs dfs -ls /tmp/foo/` : will show dir `/tmp/foo/123`; > `hdfs dfs -ls hdfs://subcluster1/tmp/foo/` : will show dir > `hdfs://subcluster1/tmp/foo/123`; > `hdfs dfs -ls hdfs://subcluster2/tmp/foo/` : will show dir > `hdfs://subcluster2/tmp/foo/123`; > 5. rename `/tmp/foo/123` to `/user/foo/123`. The op will succeed. `hdfs dfs > -mv /tmp/foo/123 /user/foo/123 ` > 6. check the dirs again; the rbf cluster still shows dir `/tmp/foo/123` > `hdfs dfs -ls /tmp/foo/` : will show dir `/tmp/foo/123`; > `hdfs dfs -ls hdfs://subcluster1/tmp/foo/` : will show no dirs; > `hdfs dfs -ls hdfs://subcluster2/tmp/foo/` : will show dir > `hdfs://subcluster2/tmp/foo/123`; > Step 5 should throw an exception. 
> > > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16907) Add LastHeartbeatResponseTime for BP service actor
[ https://issues.apache.org/jira/browse/HDFS-16907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16907. - Fix Version/s: 3.4.0 3.3.9 Resolution: Fixed > Add LastHeartbeatResponseTime for BP service actor > -- > > Key: HDFS-16907 > URL: https://issues.apache.org/jira/browse/HDFS-16907 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9 > > Attachments: Screenshot 2023-02-03 at 6.12.24 PM.png > > > BP service actor LastHeartbeat is not sufficient to track realtime connection > breaks. > Each BP service actor thread maintains _lastHeartbeatTime_ with the namenode > that it is connected to. However, this is updated even if the connection to > the namenode is broken. > Suppose, the actor thread keeps heartbeating to namenode and suddenly the > socket connection is broken. When this happens, until specific time duration, > the actor thread consistently keeps updating _lastHeartbeatTime_ before even > initiating heartbeat connection with namenode. If connection cannot be > established even after RPC retries are exhausted, then IOException is thrown. > This means that heartbeat response has not been received from the namenode. > In the loop, the actor thread keeps trying connecting for heartbeat and the > last heartbeat stays close to 1/2s even though in reality there is no > response being received from namenode. 
> > Sample Exception from the BP service actor thread, during which LastHeartbeat > stays very low: > {code:java} > 2023-02-03 22:34:55,725 WARN [xyz:9000] datanode.DataNode - IOException in > offerService > java.io.EOFException: End of File Exception between local host is: "dn-0"; > destination host is: "nn-1":9000; : java.io.EOFException; For more details > see: http://wiki.apache.org/hadoop/EOFException > at sun.reflect.GeneratedConstructorAccessor34.newInstance(Unknown Source) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:913) > at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:862) > at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1553) > at org.apache.hadoop.ipc.Client.call(Client.java:1495) > at org.apache.hadoop.ipc.Client.call(Client.java:1392) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:242) > at > org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:129) > at com.sun.proxy.$Proxy17.sendHeartbeat(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:168) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:544) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:682) > at > org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:890) > at java.lang.Thread.run(Thread.java:750) > Caused by: java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:392) > at org.apache.hadoop.ipc.Client$IpcStreams.readResponse(Client.java:1884) > at > org.apache.hadoop.ipc.Client$Connection.receiveRpcResponse(Client.java:1176) > at 
org.apache.hadoop.ipc.Client$Connection.run(Client.java:1074) {code} > Attaching screenshots of how last heartbeat value looks when the above error > is consistently getting logged. > > Last heartbeat response time is important to initiate any auto-recovery from > datanode. Hence, we should introduce LastHeartbeatResponseTime that only gets > updated if the BP service actor thread was successfully able to retrieve > response from namenode. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16883) Duplicate field name in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-16883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16883. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Duplicate field name in hdfs-default.xml > > > Key: HDFS-16883 > URL: https://issues.apache.org/jira/browse/HDFS-16883 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Reporter: YUBI LEE >Assignee: YUBI LEE >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: image-2023-01-04-10-02-16-881.png > > > {{dfs.storage.policy.satisfier.enabled}} and > {{dfs.storage.policy.satisfier.mode}} are specified in the same > `property` tag in hdfs-default.xml. > They should be separated. Because of this, the description on the website is wrong. > [https://hadoop.apache.org/docs/r3.3.4/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml] > !image-2023-01-04-10-02-16-881.png|width=1697,height=89! > {{dfs.storage.policy.satisfier.enabled}} was deleted in > https://issues.apache.org/jira/browse/HDFS-13057. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16788) could only be written to 2 of the 3 required nodes for RS-3-2-1024k. There are 50 datanode(s) running and no node(s) are excluded in this operation
[ https://issues.apache.org/jira/browse/HDFS-16788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16788. - Resolution: Not A Bug Reach out to the user mailing list, Jira is for tracking bugs not for user queries https://hadoop.apache.org/mailing_lists.html > could only be written to 2 of the 3 required nodes for RS-3-2-1024k. There > are 50 datanode(s) running and no node(s) are excluded in this operation > --- > > Key: HDFS-16788 > URL: https://issues.apache.org/jira/browse/HDFS-16788 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs >Affects Versions: 3.1.0 >Reporter: ruiliang >Priority: Major > Attachments: image-2022-09-30-14-14-29-963.png, > image-2022-09-30-14-14-44-164.png > > > > !image-2022-09-30-14-14-44-164.png! > ||Configured Capacity:|3.02 PB| > ||Configured Remote Capacity:|0 B| > ||DFS Used:|1.39 PB (45.96%)| > ||Non DFS Used:|0 B| > ||DFS Remaining:|1.62 PB (53.67%)| > ||Block Pool Used:|1.39 PB (45.96%)| > ||DataNodes usages% (Min/Median/Max/stdDev):|8.20% / 32.44% / 98.85% / 37.30%| > ||[Live > Nodes|http://fs-hiido-yycluster06-yynn1.hiido.host.yydevops.com:50070/dfshealth.html#tab-datanode]|50 > (Decommissioned: 0, In Maintenance: 0) > | > I've been working hard in the background to balance the data, > but before I discp when > {code:java} > hdfs balancer -Ddfs.datanode.balance.max.concurrent.moves=300 > -Ddfs.balancer.moverThreads=1200 > -Ddfs.datanode.balance.bandwidthPerSec=1073741824 -fs hdfs://yycluster06 > -threshold 50 > {code} > {code:java} > hadoop distcp -Dmapreduce.task.timeout=60 -skipcrccheck -update hdfs://01 > hdfs://02xx > syslog > ... > 2022-09-30 14:22:50,724 WARN [main] org.apache.hadoop.hdfs.DFSOutputStream: > Cannot allocate parity block(index=4, policy=RS-3-2-1024k). Not enough > datanodes? 
Exclude nodes=[] 2022-09-30 14:22:58,389 INFO [main] > org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed streamer > #3: failed, blk_-9223372036808890525_3095130 2022-09-30 14:22:58,389 INFO > [main] org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed > streamer #4: failed, block==null 2022-09-30 14:23:21,547 WARN [main] > org.apache.hadoop.hdfs.DFSOutputStream: Cannot allocate parity block(index=4, > policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] 2022-09-30 > 14:23:29,319 INFO [main] org.apache.hadoop.hdfs.DFSOutputStream: replacing > previously failed streamer #4: failed, blk_-9223372036808889612_3095200 > 2022-09-30 14:23:36,950 WARN [main] org.apache.hadoop.hdfs.DFSOutputStream: > Cannot allocate parity block(index=4, policy=RS-3-2-1024k). Not enough > datanodes? Exclude nodes=[] 2022-09-30 14:23:44,822 INFO [main] > org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed streamer > #4: failed, blk_-922337203680572_3095307 2022-09-30 14:23:44,837 WARN > [main] org.apache.hadoop.hdfs.DFSOutputStream: Cannot allocate parity > block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] > 2022-09-30 14:23:52,306 INFO [main] org.apache.hadoop.hdfs.DFSOutputStream: > replacing previously failed streamer #4: failed, block==null 2022-09-30 > 14:23:52,321 WARN [main] org.apache.hadoop.hdfs.DFSOutputStream: Cannot > allocate parity block(index=4, policy=RS-3-2-1024k). Not enough datanodes? > Exclude nodes=[] 2022-09-30 14:23:59,822 INFO [main] > org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed streamer > #4: failed, block==null 2022-09-30 14:23:59,836 WARN [main] > org.apache.hadoop.hdfs.DFSOutputStream: Cannot allocate parity block(index=3, > policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] 2022-09-30 > 14:23:59,836 WARN [main] org.apache.hadoop.hdfs.DFSOutputStream: Cannot > allocate parity block(index=4, policy=RS-3-2-1024k). Not enough datanodes? 
> Exclude nodes=[] 2022-09-30 14:24:07,302 INFO [main] > org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed streamer > #3: failed, blk_-9223372036808887853_3095387 2022-09-30 14:24:07,303 INFO > [main] org.apache.hadoop.hdfs.DFSOutputStream: replacing previously failed > streamer #4: failed, block==null 2022-09-30 14:24:07,317 WARN [main] > org.apache.hadoop.hdfs.DFSOutputStream: Cannot allocate parity block(index=4, > policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] 2022-09-30 > 14:24:15,383 INFO [main] org.apache.hadoop.hdfs.DFSOutputStream: replacing > previously failed streamer #4: failed, block==null 2022-09-30 14:24:15,395 > WARN [main] org.apache.hadoop.hdfs.DFSOutputStream: Cannot allocate parity > block(index=4, policy=RS-3-2-1024k). Not enough datanodes? Exclude nodes=[] >
[jira] [Resolved] (HDFS-16341) Fix BlockPlacementPolicy details in hdfs defaults
[ https://issues.apache.org/jira/browse/HDFS-16341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16341. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix BlockPlacementPolicy details in hdfs defaults > -- > > Key: HDFS-16341 > URL: https://issues.apache.org/jira/browse/HDFS-16341 > Project: Hadoop HDFS > Issue Type: Improvement > Components: documentation >Affects Versions: 3.3.1 >Reporter: guophilipse >Assignee: guophilipse >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Now, we have six block placement policies supported, we can keep the doc > updated. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16756) RBF proxies the client's user by the login user to enable CacheEntry
[ https://issues.apache.org/jira/browse/HDFS-16756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16756. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF proxies the client's user by the login user to enable CacheEntry > > > Key: HDFS-16756 > URL: https://issues.apache.org/jira/browse/HDFS-16756 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > RBF only proxies the client's user by the login user for Kerberos > authentication. If the cluster uses the SIMPLE authentication method, RBF > will not proxy the client's user by the login user, and the downstream > namespace will not use the real clientIp, clientPort, clientId and callId, > even if the namenode has configured dfs.namenode.ip-proxy-users. > > And the related code is as below: > {code:java} > UserGroupInformation connUGI = ugi; > if (UserGroupInformation.isSecurityEnabled()) { > UserGroupInformation routerUser = UserGroupInformation.getLoginUser(); > connUGI = UserGroupInformation.createProxyUser( > ugi.getUserName(), routerUser); > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16748) RBF: DFSClient should uniquely identify writing files by namespace id and iNodeId
[ https://issues.apache.org/jira/browse/HDFS-16748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16748. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: DFSClient should uniquely identify writing files by namespace id and > iNodeId > - > > Key: HDFS-16748 > URL: https://issues.apache.org/jira/browse/HDFS-16748 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Critical > Labels: pull-request-available > Fix For: 3.4.0 > > > DFSClient should distinguish the files being written by namespaceId and iNodeId, because > files being written may belong to different namespaces with the same iNodeId. > And the related code is as below: > {code:java} > public void putFileBeingWritten(final long inodeId, > final DFSOutputStream out) { > synchronized(filesBeingWritten) { > filesBeingWritten.put(inodeId, out); > // update the last lease renewal time only when there was no > // writes. once there is one write stream open, the lease renewer > // thread keeps it updated well with in anyone's expiration time. > if (lastLeaseRenewal == 0) { > updateLastLeaseRenewal(); > } > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
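The collision described in HDFS-16748 can be shown with a toy version of the `filesBeingWritten` map (class and field names below are hypothetical, not the actual patch): keyed by inodeId alone, two streams from different namespaces clobber each other; keyed by (namespaceId, inodeId), both survive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class WritingFilesKey {
    // Hypothetical composite key: the same inodeId can occur in two namespaces.
    static final class FileKey {
        final String namespaceId;
        final long inodeId;
        FileKey(String namespaceId, long inodeId) {
            this.namespaceId = namespaceId;
            this.inodeId = inodeId;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof FileKey)) return false;
            FileKey k = (FileKey) o;
            return inodeId == k.inodeId && namespaceId.equals(k.namespaceId);
        }
        @Override public int hashCode() {
            return Objects.hash(namespaceId, inodeId);
        }
    }

    // Keyed by inodeId alone, the second namespace's stream replaces the first.
    static int trackByInodeOnly() {
        Map<Long, String> filesBeingWritten = new HashMap<>();
        filesBeingWritten.put(16386L, "stream for ns0");
        filesBeingWritten.put(16386L, "stream for ns1");  // clobbers ns0's entry
        return filesBeingWritten.size();
    }

    // Keyed by (namespaceId, inodeId), both streams stay tracked for lease renewal.
    static int trackByCompositeKey() {
        Map<FileKey, String> filesBeingWritten = new HashMap<>();
        filesBeingWritten.put(new FileKey("ns0", 16386L), "stream for ns0");
        filesBeingWritten.put(new FileKey("ns1", 16386L), "stream for ns1");
        return filesBeingWritten.size();
    }

    public static void main(String[] args) {
        System.out.println(trackByInodeOnly());     // 1: one lease silently lost
        System.out.println(trackByCompositeKey());  // 2: both leases renewed
    }
}
```

The dropped entry matters because the lease renewer iterates this map; an untracked stream stops renewing its lease.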
[jira] [Resolved] (HDFS-16728) RBF throw IndexOutOfBoundsException with disableNameServices
[ https://issues.apache.org/jira/browse/HDFS-16728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16728. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF throw IndexOutOfBoundsException with disableNameServices > > > Key: HDFS-16728 > URL: https://issues.apache.org/jira/browse/HDFS-16728 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > RBF will throw an IndexOutOfBoundsException when the namespace is disabled. > Suppose we have a mount point /a/b -> ns0 -> /a/b and we disabled the ns0. > RBF will throw IndexOutOfBoundsException during handling requests with path > starting with /a/b. > {code:java} > java.lang.IndexOutOfBoundsException: Index: 0, Size: 0 at > java.util.ArrayList.rangeCheck(ArrayList.java:657) > at java.util.ArrayList.get(ArrayList.java:433) > at > org.apache.hadoop.hdfs.server.federation.router.RouterClientProtocol.mkdirs(RouterClientProtocol.java:756) > at > org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.mkdirs(RouterRpcServer.java:980) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
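The HDFS-16728 trace shows `get(0)` on an empty location list when every nameservice behind a mount point is disabled. A minimal sketch of the guard such a fix implies (method name and message are hypothetical, not the actual Hadoop change):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SafeLocationLookup {
    // Fail with a clear IOException instead of letting get(0) throw an
    // opaque IndexOutOfBoundsException when the resolved list is empty
    // (e.g. the only nameservice for the mount point is disabled).
    static String firstLocation(List<String> locations, String path)
            throws IOException {
        if (locations == null || locations.isEmpty()) {
            throw new IOException(
                "Cannot find locations for " + path + ": no enabled nameservice");
        }
        return locations.get(0);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstLocation(Arrays.asList("ns1 -> /a/b"), "/a/b"));
        try {
            firstLocation(Collections.emptyList(), "/a/b");
        } catch (IOException e) {
            System.out.println("rejected cleanly: " + e.getMessage());
        }
    }
}
```

An IOException also propagates to the client as a meaningful remote error, whereas an IndexOutOfBoundsException surfaces as an internal server failure.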
[jira] [Resolved] (HDFS-16723) Replace incorrect SafeModeException with StandbyException in RouterRpcServer.class
[ https://issues.apache.org/jira/browse/HDFS-16723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16723. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Replace incorrect SafeModeException with StandbyException in > RouterRpcServer.class > -- > > Key: HDFS-16723 > URL: https://issues.apache.org/jira/browse/HDFS-16723 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > > Incorrect code as below: > {code:java} > /** > * ... > * @throws SafeModeException If the Router is in safe mode and cannot serve > * client requests. > */ > void checkOperation(OperationCategory op) > throws StandbyException { > ... > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16709) Remove redundant cast in FSEditLogOp.class
[ https://issues.apache.org/jira/browse/HDFS-16709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16709. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove redundant cast in FSEditLogOp.class > -- > > Key: HDFS-16709 > URL: https://issues.apache.org/jira/browse/HDFS-16709 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 50m > Remaining Estimate: 0h > > While reading some of the NameNode edit-log classes, I found that there are many > redundant casts in FSEditLogOp.class that we should remove. > Such as: > {code:java} > static UpdateBlocksOp getInstance(OpInstanceCache cache) { > return (UpdateBlocksOp)cache.get(OP_UPDATE_BLOCKS); > } {code} > Because cache.get() already casts the response to T, such as: > {code:java} > @SuppressWarnings("unchecked") > public T get(FSEditLogOpCodes opCode) { > return useCache ? (T)CACHE.get().get(opCode) : (T)newInstance(opCode); > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
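Why the caller-side cast is redundant: a generic get() method already performs the (unchecked) cast once internally, and the compiler infers T from the call site. A minimal self-contained illustration, with hypothetical names rather than the real FSEditLogOp code:

```java
import java.util.HashMap;
import java.util.Map;

public class OpCache {
    private final Map<String, Object> cache = new HashMap<>();

    void put(String opCode, Object op) {
        cache.put(opCode, op);
    }

    @SuppressWarnings("unchecked")
    <T> T get(String opCode) {
        // The unchecked cast happens exactly once, inside the generic method...
        return (T) cache.get(opCode);
    }

    public static void main(String[] args) {
        OpCache cache = new OpCache();
        cache.put("OP_UPDATE_BLOCKS", "updateBlocksOp");
        // ...so the caller needs no cast: the assignment target drives inference,
        // making `(String) cache.get(...)` redundant.
        String op = cache.get("OP_UPDATE_BLOCKS");
        System.out.println(op);
    }
}
```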
[jira] [Resolved] (HDFS-16712) Fix incorrect placeholder in DataNode.java
[ https://issues.apache.org/jira/browse/HDFS-16712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16712. - Fix Version/s: 3.4.0 3.3.9 3.2.5 Hadoop Flags: Reviewed Resolution: Fixed > Fix incorrect placeholder in DataNode.java > -- > > Key: HDFS-16712 > URL: https://issues.apache.org/jira/browse/HDFS-16712 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.9, 3.2.5 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Fix incorrect placeholder in DataNode.java > {code:java} > public String getDiskBalancerStatus() { > try { > return getDiskBalancer().queryWorkStatus().toJsonString(); > } catch (IOException ex) { > // incorrect placeholder > LOG.debug("Reading diskbalancer Status failed. ex:{}", ex); > return ""; > } > } {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
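The placeholder bug above hinges on SLF4J's rule (as I understand it) that a trailing Throwable only keeps its stack trace when it is not consumed by a {} placeholder; with "ex:{}" the exception is formatted via toString() instead. A plain-Java sketch of that rule, with no SLF4J dependency:

```java
public class PlaceholderDemo {
    /**
     * Mimics the SLF4J convention: returns true when the trailing throwable
     * would be logged with its stack trace, i.e. when there are fewer
     * placeholders than arguments so the throwable is left over.
     */
    static boolean keepsStackTrace(String pattern, Object... args) {
        int placeholders = pattern.split("\\{\\}", -1).length - 1;
        boolean lastIsThrowable =
            args.length > 0 && args[args.length - 1] instanceof Throwable;
        return lastIsThrowable && placeholders < args.length;
    }

    public static void main(String[] args) {
        Exception ex = new java.io.IOException("disk balancer query failed");
        // Buggy form from the issue: the placeholder consumes the exception,
        // so only ex.toString() would appear in the log.
        System.out.println(keepsStackTrace("Reading diskbalancer Status failed. ex:{}", ex));
        // Fixed form: no placeholder, so the exception keeps its stack trace.
        System.out.println(keepsStackTrace("Reading diskbalancer Status failed.", ex));
    }
}
```

The fix in the issue is therefore to drop the `ex:{}` placeholder and pass the exception as the final argument on its own.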
[jira] [Resolved] (HDFS-16283) RBF: improve renewLease() to call only a specific NameNode rather than make fan-out calls
[ https://issues.apache.org/jira/browse/HDFS-16283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16283. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: improve renewLease() to call only a specific NameNode rather than make > fan-out calls > - > > Key: HDFS-16283 > URL: https://issues.apache.org/jira/browse/HDFS-16283 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: rbf >Reporter: Aihua Xu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: RBF_ improve renewLease() to call only a specific > NameNode rather than make fan-out calls.pdf > > Time Spent: 6h 40m > Remaining Estimate: 0h > > Currently renewLease() against a router makes fan-out calls to all the > NameNodes. Since renewLease() calls are so frequent, if one of the NameNodes > is slow, the router queues eventually get blocked by renewLease() calls, > causing router degradation. > We will make a change on the client side to keep track of the NameNode Id in > addition to the current fileId, so routers understand which NameNode the client > is renewing a lease against. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-14656) RBF: NPE in RBFMetrics
[ https://issues.apache.org/jira/browse/HDFS-14656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-14656. - Resolution: Duplicate > RBF: NPE in RBFMetrics > -- > > Key: HDFS-14656 > URL: https://issues.apache.org/jira/browse/HDFS-14656 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > > Caused by: java.lang.NullPointerException > at > org.apache.hadoop.hdfs.server.federation.metrics.RBFMetrics.getActiveNamenodeRegistrations(RBFMetrics.java:726) > at > org.apache.hadoop.hdfs.server.federation.metrics.RBFMetrics.getNameserviceAggregatedInt(RBFMetrics.java:688) > at > org.apache.hadoop.hdfs.server.federation.metrics.RBFMetrics.getNumInMaintenanceDeadDataNodes(RBFMetrics.java:467) > at > org.apache.hadoop.hdfs.server.federation.metrics.NamenodeBeanMetrics.getNumInMaintenanceDeadDataNodes(NamenodeBeanMetrics.java:693) > at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71) > at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112) > at > com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46) > at > com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237) > at > com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83) > at > com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206) > at 
javax.management.StandardMBean.getAttribute(StandardMBean.java:372) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647) > ... 42 more > 2019-07-16 19:35:35,228 [qtp1811922029-78] ERROR jmx.JMXJsonServlet > (JMXJsonServlet.java:writeAttribute(345)) - getting attribute > NumEnteringMaintenanceDataNodes of > Hadoop:service=NameNode,name=FSNamesystem-3 threw an exception > javax.management.RuntimeMBeanException: java.lang.NullPointerException > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrow(DefaultMBeanServerInterceptor.java:839) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.rethrowMaybeMBeanException(DefaultMBeanServerInterceptor.java:852) > at > com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:651) > at > com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678) > at > org.apache.hadoop.jmx.JMXJsonServlet.writeAttribute(JMXJsonServlet.java:338) > at > org.apache.hadoop.jmx.JMXJsonServlet.listBeans(JMXJsonServlet.java:316) > at org.apache.hadoop.jmx.JMXJsonServlet.doGet(JMXJsonServlet.java:210) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:687) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) > at > org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:848) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:644) > at > org.apache.hadoop.security.authentication.server.ProxyUserAuthenticationFilter.doFilter(ProxyUserAuthenticationFilter.java:104) > at > org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592) > at org.apache.hadoop.hdfs.web.AuthFilter.doFilter(AuthFilter.java:51) > at > 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:110) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at > org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1604) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45) > at > org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759) > at >
[jira] [Resolved] (HDFS-13576) RBF: Add destination path length validation for add/update mount entry
[ https://issues.apache.org/jira/browse/HDFS-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-13576. - Resolution: Duplicate > RBF: Add destination path length validation for add/update mount entry > -- > > Key: HDFS-13576 > URL: https://issues.apache.org/jira/browse/HDFS-13576 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Dibyendu Karmakar >Assignee: Ayush Saxena >Priority: Minor > > Currently there is no validation to check destination path length while > adding or updating mount entry. But while trying to create directory using > this mount entry > {noformat} > RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$PathComponentTooLongException){noformat} > is thrown with exception message as > {noformat} > "maximum path component name limit of ... directory / is > exceeded: limit=255 length=1817"{noformat} > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
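The issue asks for up-front validation so an over-long destination component is rejected when the mount entry is added or updated, rather than surfacing later as a PathComponentTooLongException at mkdir time. A hypothetical sketch of that check, assuming the default 255-character component limit:

```java
public class MountPathValidator {
    // Default value of dfs.namenode.fs-limits.max-component-length.
    static final int MAX_COMPONENT_LENGTH = 255;

    /** Rejects a destination path whose components exceed the name-length limit. */
    static void validateDestination(String dest) {
        for (String component : dest.split("/")) {
            if (component.length() > MAX_COMPONENT_LENGTH) {
                throw new IllegalArgumentException(
                    "maximum path component name limit exceeded: limit="
                        + MAX_COMPONENT_LENGTH + " length=" + component.length());
            }
        }
    }

    public static void main(String[] args) {
        validateDestination("/user/data/ok"); // accepted silently
        try {
            validateDestination("/" + "x".repeat(300)); // 300-char component
            throw new AssertionError("expected rejection");
        } catch (IllegalArgumentException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```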
[jira] [Resolved] (HDFS-16638) Add isDebugEnabled check for debug blockLogs in BlockManager
[ https://issues.apache.org/jira/browse/HDFS-16638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16638. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add isDebugEnabled check for debug blockLogs in BlockManager > > > Key: HDFS-16638 > URL: https://issues.apache.org/jira/browse/HDFS-16638 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: dzcxzl >Assignee: dzcxzl >Priority: Trivial > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > > There are lots of concatenating Strings using blockLog.debug in BlockManager. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
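The cost being avoided here is that string concatenation builds the message eagerly even when debug logging is off; an isDebugEnabled-style guard (or {} parameterized logging) skips that work. A minimal demonstration of the difference, using a counter in place of a real logger:

```java
public class DebugGuardDemo {
    static boolean debugEnabled = false;
    static int messagesBuilt = 0;

    /** Stands in for an expensive toString()/concatenation in a hot path. */
    static String expensiveDetail() {
        messagesBuilt++;
        return "block=blk_123 replicas=3";
    }

    static void unguarded() {
        // Concatenation runs unconditionally, even with debug off.
        String msg = "Processing " + expensiveDetail();
        if (debugEnabled) System.out.println(msg);
    }

    static void guarded() {
        // The isDebugEnabled-style check skips the concatenation entirely.
        if (debugEnabled) {
            System.out.println("Processing " + expensiveDetail());
        }
    }

    public static void main(String[] args) {
        unguarded();
        System.out.println("after unguarded: built=" + messagesBuilt);
        guarded();
        System.out.println("after guarded:   built=" + messagesBuilt);
    }
}
```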
[jira] [Resolved] (HDFS-16647) Delete unused NameNode#FS_HDFS_IMPL_KEY
[ https://issues.apache.org/jira/browse/HDFS-16647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16647. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Delete unused NameNode#FS_HDFS_IMPL_KEY > --- > > Key: HDFS-16647 > URL: https://issues.apache.org/jira/browse/HDFS-16647 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Affects Versions: 3.3.3 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > There's some history here, NameNode#FS_HDFS_IMPL_KEY was introduced in > HDFS-15450, and something was removed later in HDFS-15533, but > FS_HDFS_IMPL_KEY was kept. > Here are some discussion details: > https://github.com/apache/hadoop/pull/2229#discussion_r470935801 > It seems to be cleaner to remove the unused NameNode#FS_HDFS_IMPL_KEY. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16637) TestHDFSCLI#testAll consistently failing
[ https://issues.apache.org/jira/browse/HDFS-16637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16637. - Fix Version/s: 3.4.0 3.3.4 Hadoop Flags: Reviewed Resolution: Fixed > TestHDFSCLI#testAll consistently failing > > > Key: HDFS-16637 > URL: https://issues.apache.org/jira/browse/HDFS-16637 > Project: Hadoop HDFS > Issue Type: Test >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.3.4 > > Time Spent: 40m > Remaining Estimate: 0h > > The failure seems to have been caused by output change introduced by > HDFS-16581. > {code:java} > 2022-06-19 15:41:16,183 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(146)) - Detailed results: > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(147)) - > --2022-06-19 15:41:16,184 [Listener at > localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(156)) - > --- > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(157)) - Test ID: [629] > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(158)) - Test Description: > [printTopology: verifying that the topology map is what we expect] > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(159)) - > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(163)) - Test Commands: [-fs > hdfs://localhost:51486 -printTopology] > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(167)) - > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(174)) - > 2022-06-19 15:41:16,184 [Listener at 
localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(178)) - Comparator: > [RegexpAcrossOutputComparator] > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(180)) - Comparision result: > [fail] > 2022-06-19 15:41:16,184 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(182)) - Expected output: > [^Rack: > \/rack1\s*127\.0\.0\.1:\d+\s\([-.a-zA-Z0-9]+\)\s*127\.0\.0\.1:\d+\s\([-.a-zA-Z0-9]+\)] > 2022-06-19 15:41:16,185 [Listener at localhost/51519] INFO cli.CLITestHelper > (CLITestHelper.java:displayResults(184)) - Actual output: > [Rack: /rack1 > 127.0.0.1:51487 (localhost) In Service > 127.0.0.1:51491 (localhost) In ServiceRack: /rack2 > 127.0.0.1:51500 (localhost) In Service > 127.0.0.1:51496 (localhost) In Service > 127.0.0.1:51504 (localhost) In ServiceRack: /rack3 > 127.0.0.1:51508 (localhost) In ServiceRack: /rack4 > 127.0.0.1:51512 (localhost) In Service > 127.0.0.1:51516 (localhost) In Service] > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16621) Remove unused JNStorage#getCurrentDir()
[ https://issues.apache.org/jira/browse/HDFS-16621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16621. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Remove unused JNStorage#getCurrentDir() > --- > > Key: HDFS-16621 > URL: https://issues.apache.org/jira/browse/HDFS-16621 > Project: Hadoop HDFS > Issue Type: Improvement > Components: journal-node, qjm >Affects Versions: 3.3.0 >Reporter: JiangHua Zhu >Assignee: JiangHua Zhu >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1.5h > Remaining Estimate: 0h > > There is no use of getCurrentDir() anywhere in JNStorage, we should remove it. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16599) Fix typo in hadoop-hdfs-rbf module
[ https://issues.apache.org/jira/browse/HDFS-16599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16599. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Fix typo in hadoop-hdfs-rbf module > -- > > Key: HDFS-16599 > URL: https://issues.apache.org/jira/browse/HDFS-16599 > Project: Hadoop HDFS > Issue Type: Improvement >Affects Versions: 3.4.0 >Reporter: fanshilun >Assignee: fanshilun >Priority: Minor > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15878) RBF: Fix TestRouterWebHDFSContractCreate#testSyncable
[ https://issues.apache.org/jira/browse/HDFS-15878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-15878. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Fix TestRouterWebHDFSContractCreate#testSyncable > - > > Key: HDFS-15878 > URL: https://issues.apache.org/jira/browse/HDFS-15878 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, rbf >Reporter: Renukaprasad C >Assignee: Hanley Yang >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 1h > Remaining Estimate: 0h > > ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: > 24.627 s <<< FAILURE! - in > org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate > [ERROR] > testSyncable(org.apache.hadoop.fs.contract.router.web.TestRouterWebHDFSContractCreate) > Time elapsed: 0.222 s <<< ERROR! > java.io.FileNotFoundException: File /test/testSyncable not found. > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:110) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.toIOException(WebHdfsFileSystem.java:576) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$900(WebHdfsFileSystem.java:146) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.shouldRetry(WebHdfsFileSystem.java:892) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:858) > at > 
org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:652) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:690) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:686) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.getRedirectedUrl(WebHdfsFileSystem.java:2307) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$ReadRunner.(WebHdfsFileSystem.java:2296) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem$WebHdfsInputStream.(WebHdfsFileSystem.java:2176) > at > org.apache.hadoop.hdfs.web.WebHdfsFileSystem.open(WebHdfsFileSystem.java:1610) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:975) > at > org.apache.hadoop.fs.contract.AbstractContractCreateTest.validateSyncableSemantics(AbstractContractCreateTest.java:556) > at > org.apache.hadoop.fs.contract.AbstractContractCreateTest.testSyncable(AbstractContractCreateTest.java:459) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:748) > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File > /test/testSyncable not found. > at >
[jira] [Resolved] (HDFS-16587) Allow configuring Handler number for the JournalNodeRpcServer
[ https://issues.apache.org/jira/browse/HDFS-16587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16587. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Allow configuring Handler number for the JournalNodeRpcServer > - > > Key: HDFS-16587 > URL: https://issues.apache.org/jira/browse/HDFS-16587 > Project: Hadoop HDFS > Issue Type: Wish >Reporter: ZanderXu >Assignee: ZanderXu >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > We can allow configuring the handler number for the JournalNodeRpcServer. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-15225) RBF: Add snapshot counts to content summary in router
[ https://issues.apache.org/jira/browse/HDFS-15225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-15225. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > RBF: Add snapshot counts to content summary in router > - > > Key: HDFS-15225 > URL: https://issues.apache.org/jira/browse/HDFS-15225 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Quan Li >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16548) Failed unit test testRenameMoreThanOnceAcrossSnapDirs_2
[ https://issues.apache.org/jira/browse/HDFS-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16548. - Resolution: Abandoned Not a test issue; the production code itself has issues. I have reopened the original issue; we can follow up there or revert the original Jira. > Failed unit test testRenameMoreThanOnceAcrossSnapDirs_2 > --- > > Key: HDFS-16548 > URL: https://issues.apache.org/jira/browse/HDFS-16548 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: tomscut >Priority: Major > > It seems to be related to HDFS-16531. > {code:java} > [ERROR] Tests run: 44, Failures: 6, Errors: 0, Skipped: 0, Time elapsed: > 143.701 s <<< FAILURE! - in > org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots > [ERROR] > testRenameMoreThanOnceAcrossSnapDirs_2(org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots) > Time elapsed: 6.606 s <<< FAILURE! > java.lang.AssertionError: expected:<3> but was:<1> > at org.junit.Assert.fail(Assert.java:89) > at org.junit.Assert.failNotEquals(Assert.java:835) > at org.junit.Assert.assertEquals(Assert.java:647) > at org.junit.Assert.assertEquals(Assert.java:633) > at > org.apache.hadoop.hdfs.server.namenode.snapshot.TestRenameWithSnapshots.testRenameMoreThanOnceAcrossSnapDirs_2(TestRenameWithSnapshots.java:985) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > at org.junit.runners.ParentRunner.run(ParentRunner.java:413) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Reopened] (HDFS-16531) Avoid setReplication logging an edit record if old replication equals the new value
[ https://issues.apache.org/jira/browse/HDFS-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena reopened HDFS-16531: - > Avoid setReplication logging an edit record if old replication equals the new > value > --- > > Key: HDFS-16531 > URL: https://issues.apache.org/jira/browse/HDFS-16531 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0, 3.2.4, 3.3.4 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > I recently came across a NN log where about 800k setRep calls were made, > setting the replication from 3 to 3 - ie leaving it unchanged. > Even in a case like this, we log an edit record, an audit log, and perform > some quota checks etc. > I believe it should be possible to avoid some of the work if we check for > oldRep == newRep and jump out of the method early. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
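The optimization described above can be sketched as a simple early exit: when the requested replication equals the current value, skip the quota checks and emit no edit record. A hedged, illustrative sketch; the names below are stand-ins, not the actual NameNode code:

```java
public class SetReplicationSketch {
    static int editLogRecords = 0;

    /**
     * Returns true when replication actually changed (and an edit record
     * would be logged); the oldRep == newRep case exits before any work.
     */
    static boolean setReplication(int oldRep, int newRep) {
        if (oldRep == newRep) {
            return false;          // no-op: no quota update, no edit record
        }
        editLogRecords++;          // stand-in for logging the edit record
        return true;
    }

    public static void main(String[] args) {
        // The 800k-call case from the report: setRep 3 -> 3 is a no-op.
        System.out.println(setReplication(3, 3) + " edits=" + editLogRecords);
        // A real change still logs an edit.
        System.out.println(setReplication(3, 2) + " edits=" + editLogRecords);
    }
}
```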
[jira] [Resolved] (HDFS-16526) Add metrics for slow DataNode
[ https://issues.apache.org/jira/browse/HDFS-16526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HDFS-16526. - Fix Version/s: 3.4.0 Hadoop Flags: Reviewed Resolution: Fixed > Add metrics for slow DataNode > - > > Key: HDFS-16526 > URL: https://issues.apache.org/jira/browse/HDFS-16526 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Renukaprasad C >Assignee: Renukaprasad C >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: Metrics-html.png > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Add some more metrics for slow datanode operations - FlushOrSync, > PacketResponder send ACK. -- This message was sent by Atlassian Jira (v8.20.7#820007) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org