[jira] [Created] (HDDS-2480) Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect
Supratim Deka created HDDS-2480: --- Summary: Sonar : remove log spam for exceptions inside XceiverClientGrpc.reconnect Key: HDDS-2480 URL: https://issues.apache.org/jira/browse/HDDS-2480 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsWE=AW5md_AGKcVY8lQ4ZsWE -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-2479) Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry
Supratim Deka created HDDS-2479: --- Summary: Sonar : replace instanceof with catch block in XceiverClientGrpc.sendCommandWithRetry Key: HDDS-2479 URL: https://issues.apache.org/jira/browse/HDDS-2479 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV_=AW5md_AGKcVY8lQ4ZsV_
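For readers unfamiliar with this Sonar rule, the ticket title suggests replacing an `instanceof` test inside a broad catch with a dedicated catch clause. A minimal illustrative sketch (not the actual XceiverClientGrpc code; the class and exception names here are made up):

```java
// Illustrative only -- not the actual Ozone code.
public class CatchVsInstanceof {
    static class SpecificException extends RuntimeException {
        SpecificException(String m) { super(m); }
    }

    // Before: catch broadly, then branch with instanceof.
    static String before(Runnable r) {
        try {
            r.run();
            return "ok";
        } catch (RuntimeException e) {
            if (e instanceof SpecificException) {
                return "specific";
            }
            return "generic";
        }
    }

    // After: let the catch clauses do the type dispatch.
    static String after(Runnable r) {
        try {
            r.run();
            return "ok";
        } catch (SpecificException e) {
            return "specific";
        } catch (RuntimeException e) {
            return "generic";
        }
    }

    public static void main(String[] args) {
        Runnable boom = () -> { throw new SpecificException("x"); };
        System.out.println(before(boom) + " " + after(boom));
    }
}
```

Both versions behave identically; the second makes the handled types visible in the method signature of the catch clauses, which is what Sonar prefers.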
[jira] [Created] (HDDS-2478) Sonar : remove temporary variable in XceiverClientSpi.sendCommand
Supratim Deka created HDDS-2478: --- Summary: Sonar : remove temporary variable in XceiverClientSpi.sendCommand Key: HDDS-2478 URL: https://issues.apache.org/jira/browse/HDDS-2478 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: SCM Reporter: Supratim Deka Assignee: Supratim Deka Sonar issue : https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md_AGKcVY8lQ4ZsV1=AW5md_AGKcVY8lQ4ZsV1
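The Sonar rule behind this ticket flags a local variable that only carries a value to the immediately following return. A generic sketch of the before/after shape (not the actual XceiverClientSpi code):

```java
// Illustrative only -- not the actual Ozone code.
public class TempVariable {
    // Before: the temporary exists only to be returned on the next line.
    static int sumBefore(int a, int b) {
        int result = a + b;
        return result;
    }

    // After: return the expression directly.
    static int sumAfter(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(sumAfter(2, 3)); // prints 5
    }
}
```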
[jira] [Resolved] (HDDS-2383) Closing open container via SCMCli throws exception
[ https://issues.apache.org/jira/browse/HDDS-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nanda kumar resolved HDDS-2383. --- Resolution: Duplicate > Closing open container via SCMCli throws exception > -- > > Key: HDDS-2383 > URL: https://issues.apache.org/jira/browse/HDDS-2383 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > This was observed in apache master branch. > Closing the container via {{SCMCli}} throws the following exception, though > the container ends up getting closed eventually. > {noformat} > 2019-10-30 02:44:41,794 INFO > org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService: Block deletion > txnID mismatch in datanode 79626ba3-1957-46e5-a8b0-32d7f47fb801 for > containerID 6. Datanode delete txnID: 0, SCM txnID: 1004 > 2019-10-30 02:44:41,810 INFO > org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: > Moving container #4 to CLOSED state, datanode > 8885d4ba-228a-4fd2-bf5a-831f01594c6c{ip: 10.17.234.37, host: > vd1327.halxg.cloudera.com, networkLocation: /default-rack, certSerialId: > null} reported CLOSED replica. > 2019-10-30 02:44:41,826 INFO > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer: Object type > container id 4 op close new stage complete > 2019-10-30 02:44:41,826 ERROR > org.apache.hadoop.hdds.scm.container.ContainerStateManager: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. > 2019-10-30 02:44:41,826 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 6 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.submitRequest > from 10.17.234.32:45926 > org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:338) > at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:326) > at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:388) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:303) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:158) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB$$Lambda$152/2036820231.apply(Unknown > Source) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:30454) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat}
[jira] [Created] (HDFS-14988) HDFS should avoid read/write data from slow disks.
yimeng created HDFS-14988: - Summary: HDFS should avoid read/write data from slow disks. Key: HDFS-14988 URL: https://issues.apache.org/jira/browse/HDFS-14988 Project: Hadoop HDFS Issue Type: Improvement Components: block placement, datanode Affects Versions: 3.2.1, 3.1.1 Reporter: yimeng A slow disk causes real-time services (such as HBase) to slow down. Slow disk detection was added in HDFS-11461, but detected slow disks are only recorded in metrics. I would like to take further action on detected slow disks. For reads, slow disks can be factored into the read policy: if the requested block is on a slow disk of a DataNode, the replica on another DataNode is selected instead. For writes, slow disks can be factored into the write policy: we can remove slow disks from the candidate set and then select a disk based on dfs.datanode.fsdataset.volume.choosing.policy.
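The proposed write-side handling (filter slow disks first, then apply the configured choosing policy) can be sketched as follows. This is a hypothetical illustration of the proposal, not actual DataNode code; the names and the trivial "pick first" policy stand in for the real volume-choosing policy:

```java
// Hypothetical sketch of the proposed write path: drop known-slow volumes
// before handing the candidate list to the configured choosing policy.
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SlowDiskAwareChooser {
    static String chooseVolume(List<String> volumes, Set<String> slowDisks) {
        List<String> candidates = new ArrayList<>();
        for (String v : volumes) {
            if (!slowDisks.contains(v)) {
                candidates.add(v); // keep only volumes not flagged as slow
            }
        }
        // Fall back to the full list if every volume is marked slow,
        // so writes are never blocked entirely.
        if (candidates.isEmpty()) {
            candidates = volumes;
        }
        // Stand-in for dfs.datanode.fsdataset.volume.choosing.policy:
        // here, simply take the first remaining candidate.
        return candidates.get(0);
    }

    public static void main(String[] args) {
        System.out.println(chooseVolume(List.of("/data1", "/data2"), Set.of("/data1")));
    }
}
```

The key design choice is the fallback: excluding slow disks should degrade placement preference, not availability.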
[jira] [Resolved] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2308. Fix Version/s: 0.5.0 Resolution: Fixed Committed to the build branch. > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: hs_err_pid16346.log > > Time Spent: 20m > Remaining Estimate: 0h > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the rocksdb JNI interface. > In the current ozone-build docker image (which provides the environment for > the build) we use alpine, where musl libc is used instead of the standard glibc. I > think it would be safer to use the same glibc that is used in production. > I tested with a centos-based docker image and it seems to be more stable. I > didn't see any more JVM crashes.
[jira] [Created] (HDDS-2477) TableCache cleanup issue for OM non-HA
Bharat Viswanadham created HDDS-2477: Summary: TableCache cleanup issue for OM non-HA Key: HDDS-2477 URL: https://issues.apache.org/jira/browse/HDDS-2477 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham In the OM non-HA case, the ratisTransactionLogIndex is generated by OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called from multiple handler threads. So consider a case where a thread with index 10 has added its entry to the doubleBuffer while indexes 0-9 have not yet been added. The DoubleBuffer flush thread flushes and calls cleanup, which removes all cache entries with an epoch less than or equal to 10. It should not clean up entries that were put into the cache later and are still in the process of being flushed to DB. This causes inconsistency for some OM requests. Example: 4 threads committing 4 parts. 1st thread - part 1 - ratis index 3. 2nd thread - part 2 - ratis index 2. 3rd thread - part 3 - ratis index 1. The first thread gets the lock, puts OmMultipartInfo (with part 1) into the doubleBuffer and cache, and cleanup is called to remove all cache entries with index less than or equal to 3. In the meantime the 2nd and 3rd threads put parts 2 and 3 into OmMultipartInfo in the cache and doubleBuffer, but the first thread's cleanup (called with index 3) may remove those entries. Now when the 4th part upload arrives, commit Multipart Upload reads the multipart info and sees only part 1 in OmMultipartInfo, because the OmMultipartInfo with parts 1, 2, 3 is still in the process of committing to DB. So after the 4th part upload completes, DB and cache will contain only parts 1 and 4; part 2 and 3 information is lost. So for the non-HA case, cleanup should be called with the list of epochs that actually need to be cleaned up.
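The proposed fix, cleanup driven by an explicit list of flushed epochs rather than a "remove everything below this index" threshold, can be sketched as below. The structure is illustrative; it is not the actual TableCache implementation:

```java
// Hedged sketch: cleanup() receives the exact epochs that were flushed
// to DB, so entries added concurrently with smaller indexes survive.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EpochListCleanup {
    private final Map<Long, String> cache = new HashMap<>();

    void put(long epoch, String value) { cache.put(epoch, value); }

    // Only the epochs that were actually flushed are evicted.
    void cleanup(List<Long> flushedEpochs) {
        for (Long e : flushedEpochs) {
            cache.remove(e);
        }
    }

    boolean contains(long epoch) { return cache.containsKey(epoch); }

    public static void main(String[] args) {
        EpochListCleanup c = new EpochListCleanup();
        c.put(3, "part1");      // first thread, ratis index 3, flushed
        c.put(1, "part3");      // slower threads, still in flight
        c.put(2, "part2");
        // A threshold-based cleanup("<= 3") would wrongly drop epochs 1 and 2.
        c.cleanup(List.of(3L));
        System.out.println(c.contains(1) + " " + c.contains(2));
    }
}
```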
[jira] [Created] (HDDS-2476) Share more code between metadata and data scanners
Attila Doroszlai created HDDS-2476: -- Summary: Share more code between metadata and data scanners Key: HDDS-2476 URL: https://issues.apache.org/jira/browse/HDDS-2476 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai There are several duplicated / similar pieces of code in metadata and data scanners. More code should be reused. Examples: # ContainerDataScrubberMetrics and ContainerMetadataScrubberMetrics have 3 common metrics # lifecycle of ContainerMetadataScanner and ContainerDataScanner (main loop, iteration, metrics processing, shutdown)
[jira] [Created] (HDDS-2475) Unregister ContainerMetadataScrubberMetrics on thread exit
Attila Doroszlai created HDDS-2475: -- Summary: Unregister ContainerMetadataScrubberMetrics on thread exit Key: HDDS-2475 URL: https://issues.apache.org/jira/browse/HDDS-2475 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode Reporter: Attila Doroszlai {{ContainerMetadataScanner}} thread should call {{ContainerMetadataScrubberMetrics#unregister}} before exiting.
[jira] [Created] (HDDS-2474) Remove OzoneClient exception Precondition check
Hanisha Koneru created HDDS-2474: Summary: Remove OzoneClient exception Precondition check Key: HDDS-2474 URL: https://issues.apache.org/jira/browse/HDDS-2474 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Hanisha Koneru Assignee: Hanisha Koneru If RaftClientReply encounters an exception other than NotLeaderException, NotReplicatedException, StateMachineException or LeaderNotReady, then it sets success to false but no exception is set. This causes a Precondition check failure in XceiverClientRatis, which expects an exception to be present whenever success=false.
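The failure mode described above can be illustrated with a small sketch. The class and method names here are stand-ins, not actual Ratis or Ozone APIs; the point is tolerating a failed reply whose exception field is null instead of asserting on it:

```java
// Hedged sketch of the problem: a failed reply may carry no exception,
// so a hard Precondition on reply.exception != null can blow up.
public class ReplyCheck {
    static class Reply {
        final boolean success;
        final Exception exception; // may be null even when success == false
        Reply(boolean s, Exception e) { success = s; exception = e; }
    }

    // Before (sketch): Preconditions.checkState(reply.exception != null)
    // After: tolerate a missing exception and synthesize a generic one.
    static Exception failureOf(Reply reply) {
        if (reply.success) {
            return null;
        }
        return reply.exception != null
            ? reply.exception
            : new java.io.IOException("Request failed without exception detail");
    }

    public static void main(String[] args) {
        System.out.println(failureOf(new Reply(false, null)).getMessage());
    }
}
```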
[jira] [Resolved] (HDDS-1847) Datanode Kerberos principal and keytab config key looks inconsistent
[ https://issues.apache.org/jira/browse/HDDS-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-1847. Fix Version/s: 0.5.0 Resolution: Fixed [~chris.t...@gmail.com] Thanks for the contribution. [~elek] Thanks for retesting this patch. I have committed this change to the master branch. > Datanode Kerberos principal and keytab config key looks inconsistent > > > Key: HDDS-1847 > URL: https://issues.apache.org/jira/browse/HDDS-1847 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Eric Yang >Assignee: Chris Teoh >Priority: Major > Labels: newbie, pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Ozone Kerberos configuration can be very confusing: > | config name | Description | > | hdds.scm.kerberos.principal | SCM service principal | > | hdds.scm.kerberos.keytab.file | SCM service keytab file | > | ozone.om.kerberos.principal | Ozone Manager service principal | > | ozone.om.kerberos.keytab.file | Ozone Manager keytab file | > | hdds.scm.http.kerberos.principal | SCM service spnego principal | > | hdds.scm.http.kerberos.keytab.file | SCM service spnego keytab file | > | ozone.om.http.kerberos.principal | Ozone Manager spnego principal | > | ozone.om.http.kerberos.keytab.file | Ozone Manager spnego keytab file | > | hdds.datanode.http.kerberos.keytab | Datanode spnego keytab file | > | hdds.datanode.http.kerberos.principal | Datanode spnego principal | > | dfs.datanode.kerberos.principal | Datanode service principal | > | dfs.datanode.keytab.file | Datanode service keytab file | > The prefixes are very different for each of the datanode configuration keys. It > would be nice to have some consistency for the datanode.
[jira] [Resolved] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2364. Fix Version/s: 0.5.0 Resolution: Fixed [~avijayan] Thanks for the contribution. [~bharat] Thanks for the reviews. I have committed this to the master branch. > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist.
[jira] [Created] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
Aravindan Vijayan created HDDS-2473: --- Summary: Fix code reliability issues found by Sonar in Ozone Recon module. Key: HDDS-2473 URL: https://issues.apache.org/jira/browse/HDDS-2473 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Recon Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0
[jira] [Created] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
Aravindan Vijayan created HDDS-2472: --- Summary: Use try-with-resources while creating FlushOptions in RDBStore. Key: HDDS-2472 URL: https://issues.apache.org/jira/browse/HDDS-2472 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Link to the Sonar issue: https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4.
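RocksDB's FlushOptions is an AutoCloseable wrapper around a native handle, which is why try-with-resources matters here. The sketch below shows the shape of the fix with a stand-in class so it is self-contained; it is not the actual RDBStore code:

```java
// Minimal try-with-resources sketch. The FlushOptions stand-in mimics an
// AutoCloseable native handle (real code would use org.rocksdb.FlushOptions).
public class TryWithResources {
    static class FlushOptions implements AutoCloseable {
        boolean closed = false;
        FlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override public void close() { closed = true; }
    }

    static FlushOptions last; // kept only so the demo can observe close()

    static void flush() {
        // The resource is released even if the body throws.
        try (FlushOptions opts = new FlushOptions()) {
            last = opts;
            opts.setWaitForFlush(true);
            // db.flush(opts) would go here in the real store.
        }
    }

    public static void main(String[] args) {
        flush();
        System.out.println(last.closed); // prints true
    }
}
```

Without the try-with-resources block, an exception between allocation and the manual close leaks the native handle.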
[jira] [Resolved] (HDDS-2412) Define description/topics/merge strategy for the github repository with .asf.yaml
[ https://issues.apache.org/jira/browse/HDDS-2412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2412. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, I have committed this patch to the master. [~elek] Thanks for the contribution. [~adoroszlai] Thanks for the reviews. > Define description/topics/merge strategy for the github repository with > .asf.yaml > - > > Key: HDDS-2412 > URL: https://issues.apache.org/jira/browse/HDDS-2412 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > .asf.yaml helps to set different parameters on github repositories without > admin privileges: > [https://cwiki.apache.org/confluence/display/INFRA/.asf.yaml+features+for+git+repositories] > This basic .asf.yaml defines description/url/topics and the allowed merge > buttons.
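For reference, a minimal .asf.yaml of the kind described above looks roughly like this. The field names follow the INFRA documentation linked in the ticket; the values here are illustrative, not the ones actually committed:

```yaml
# Hypothetical minimal .asf.yaml -- values are illustrative.
github:
  description: "Scalable, redundant, and distributed object store for Hadoop"
  homepage: https://hadoop.apache.org/ozone/
  labels:
    - hadoop
    - ozone
    - storage
  enabled_merge_buttons:
    squash: true
    merge: false
    rebase: false
```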
[jira] [Resolved] (HDDS-2400) Enable github actions based builds for Ozone
[ https://issues.apache.org/jira/browse/HDDS-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anu Engineer resolved HDDS-2400. Fix Version/s: 0.5.0 Resolution: Fixed Thanks, Committed to the master. > Enable github actions based builds for Ozone > > > Key: HDDS-2400 > URL: https://issues.apache.org/jira/browse/HDDS-2400 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: build >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Current PR checks are executed in a private branch based on the scripts in > [https://github.com/elek/argo-ozone] > but the results are stored in public repositories: > [https://github.com/elek/ozone-ci-q4|https://github.com/elek/ozone-ci-q3] > [https://github.com/elek/ozone-ci-03] > > As we discussed during the community calls, it would be great to use github > actions (or any other cloud-based build) to make all the build definitions > more accessible for the community. > [~vivekratnavel] checked CircleCI, which has better reporting capabilities. > But INFRA has concerns about the permission model of circle-ci: > {quote}it is highly unlikley we will allow a bot to be able to commit code > (whether or not that is the intention, allowing circle-ci will make this > possible, and is a complete no) > {quote} > See: > https://issues.apache.org/jira/browse/INFRA-18131 > [https://lists.apache.org/thread.html/af52e2a3e865c01596d46374e8b294f2740587dbd59d85e132429b6c@%3Cbuilds.apache.org%3E] > > Fortunately we have a clear contract. Our build scripts are stored under > _hadoop-ozone/dev-support/checks_ (the return code shows the result, details are > printed out to the console output). It's very easy to experiment with > different build systems. > > GitHub Actions seems to be an obvious choice: it's integrated well with GitHub > and it has more generous resource limitations.
> > With this Jira I propose to enable github actions based PR checks for a few > tests (author, rat, unit, acceptance, checkstyle, findbugs) as an experiment. >
[jira] [Resolved] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanisha Koneru resolved HDDS-2392. -- Resolution: Fixed Fixed by RATIS-747 > Fix TestScmSafeMode#testSCMSafeModeRestrictedOp > --- > > Key: HDDS-2392 > URL: https://issues.apache.org/jira/browse/HDDS-2392 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Blocker > > After ratis upgrade (HDDS-2340), TestScmSafeMode#testSCMSafeModeRestrictedOp > fails as the DNs fail to restart XceiverServerRatis. > RaftServer#start() fails with following exception: > {code:java} > java.io.IOException: java.lang.IllegalStateException: Not started > at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54) > at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61) > at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:284) > at > org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:296) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:421) > at > org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:215) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:110) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: 
java.lang.IllegalStateException: Not started > at > org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:504) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.getPort(ServerImpl.java:176) > at > org.apache.ratis.grpc.server.GrpcService.lambda$new$2(GrpcService.java:143) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.grpc.server.GrpcService.getInetSocketAddress(GrpcService.java:182) > at > org.apache.ratis.server.impl.RaftServerImpl.lambda$new$0(RaftServerImpl.java:84) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.getPeer(RaftServerImpl.java:136) > at > org.apache.ratis.server.impl.RaftServerMetrics.<init>(RaftServerMetrics.java:70) > at > org.apache.ratis.server.impl.RaftServerMetrics.getRaftServerMetrics(RaftServerMetrics.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.<init>(RaftServerImpl.java:119) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208) > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) > {code}
[jira] [Created] (HDDS-2471) Improve exception message for CompleteMultipartUpload
Bharat Viswanadham created HDDS-2471: Summary: Improve exception message for CompleteMultipartUpload Key: HDDS-2471 URL: https://issues.apache.org/jira/browse/HDDS-2471 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham When an InvalidPart error occurs, the exception message does not include any information about partName and partNumber; it would be good to include this information.
[jira] [Created] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload
Bharat Viswanadham created HDDS-2470: Summary: Add partName, partNumber for CommitMultipartUpload Key: HDDS-2470 URL: https://issues.apache.org/jira/browse/HDDS-2470 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham Right now, commit Multipart Upload does not print partName and partNumber into the audit log. 2019-11-13 15:14:10,191 | INFO | OMAudit | user=root | ip=9.134.50.210 | op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID { containerBlockID { containerID: 2 localID: 103129366531867089 } blockCommitSequenceId: 4978 } offset: 0 length: 5242880 createVersion: 0 pipeline { leaderID: "" members { uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551" ipAddress: "9.134.51.232" hostName: "9.134.51.232" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551" networkLocation: "/default-rack" } members { uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d" ipAddress: "9.134.51.25" hostName: "9.134.51.25" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d" networkLocation: "/default-rack" } members { uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" ipAddress: "9.134.51.215" hostName: "9.134.51.215" ports { name: "RATIS" value: 9858 } ports { name: "STANDALONE" value: 9859 } networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03" networkLocation: "/default-rack" } state: PIPELINE_OPEN type: RATIS factor: THREE id { id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8" } } ]} | ret=SUCCESS |
Today's Hadoop storage online community sync
Thanks again to Zhenyu for a great presentation. Posting my notes for future reference, and feel free to check out the presentation slides. TL;DR: Hadoop trunk builds successfully on ARM! There are test failures but it's great to see progress. If you want to test out, Linaro provides ARM VM instances for developers for free. Pretty cool! 11/13/2019 supporting ARM/aarch64 for Hadoop Attendee: Zhenyu, weichiu, Craig, Matt, Steven, Matt, Vinayakumar, Deveraj, kevin, Matthew Please find the presentation made by Zhenyu in the link: https://docs.google.com/presentation/d/1ASwKGwID3JEkKVClm-4-pR1UxEBRrrTiUkX3CxstfKY/edit?usp=sharing Steve L.: People need to care about nightly failures. Zhenyu: add ARM test build in nightly or periodically, and then precommit builds. On Fri, Nov 8, 2019 at 3:58 PM Wei-Chiu Chuang wrote: > Hi, > > I am happy to invite Zhenyu to join us to talk about the recent proposal > of supporting ARM/aarch64 for Hadoop. > > November 13 (Wednesday) US Pacific Time 10am / November 13 (Wednesday) > Bangalore 11:30pm) / November 14 (Thursday) Beijing 2am. > > Previous meeting notes: > > https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit > > Access via Zoom: > > https://cloudera.zoom.us/j/880548968 > > One tap mobile > > +16465588656,,880548968# US (New York) > > +17207072699,,880548968# US > > Dial by your location > > +1 646 558 8656 US (New York) > > +1 720 707 2699 US > > 877 853 5257 US Toll-free > > 888 475 4499 US Toll-free > > Meeting ID: 880 548 968 > Find your local number: https://zoom.us/u/acaGRDfMVl >
[jira] [Created] (HDDS-2469) Avoid changing client-side key metadata
Attila Doroszlai created HDDS-2469:
Summary: Avoid changing client-side key metadata
Key: HDDS-2469
URL: https://issues.apache.org/jira/browse/HDDS-2469
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai

The Ozone RPC client should not modify the caller's input metadata map while creating keys.
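The usual fix for this class of bug is a defensive copy. A minimal sketch, assuming the client receives the caller's metadata map and needs to add its own entries; the method and key names here are illustrative, not the actual Ozone API:

```java
import java.util.HashMap;
import java.util.Map;

public class DefensiveCopyExample {
  // Hypothetical stand-in for the client-side key-creation path: copy the
  // caller's map before adding client-side metadata, so the caller's map
  // is never mutated.
  static Map<String, String> withClientMetadata(Map<String, String> callerMetadata) {
    Map<String, String> copy = new HashMap<>(callerMetadata); // defensive copy
    copy.put("gdprEnabled", "true"); // illustrative client-side entry
    return copy;
  }

  public static void main(String[] args) {
    Map<String, String> input = new HashMap<>();
    input.put("owner", "alice");
    Map<String, String> result = withClientMetadata(input);
    System.out.println(input.containsKey("gdprEnabled"));  // false: caller's map untouched
    System.out.println(result.containsKey("gdprEnabled")); // true
  }
}
```

The copy costs one extra allocation per key creation, which is cheap relative to the RPC itself.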
Re: Next Wednesday (Nov 13) Hadoop storage online community sync
Just a reminder. This online sync is starting in 3 minutes. On Fri, Nov 8, 2019 at 3:58 PM Wei-Chiu Chuang wrote: > Hi, > > I am happy to invite Zhenyu to join us to talk about the recent proposal > of supporting ARM/aarch64 for Hadoop. > > November 13 (Wednesday) US Pacific Time 10am / November 13 (Wednesday) > Bangalore 11:30pm) / November 14 (Thursday) Beijing 2am. > > Previous meeting notes: > > https://docs.google.com/document/d/1jXM5Ujvf-zhcyw_5kiQVx6g-HeKe-YGnFS_1-qFXomI/edit > > Access via Zoom: > > https://cloudera.zoom.us/j/880548968 > > One tap mobile > > +16465588656,,880548968# US (New York) > > +17207072699,,880548968# US > > Dial by your location > > +1 646 558 8656 US (New York) > > +1 720 707 2699 US > > 877 853 5257 US Toll-free > > 888 475 4499 US Toll-free > > Meeting ID: 880 548 968 > Find your local number: https://zoom.us/u/acaGRDfMVl >
[jira] [Resolved] (HDDS-2463) Reduce unnecessary getServiceInfo calls
[ https://issues.apache.org/jira/browse/HDDS-2463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nanda kumar resolved HDDS-2463.
Fix Version/s: 0.5.0
Resolution: Fixed

> Reduce unnecessary getServiceInfo calls
> Key: HDDS-2463
> URL: https://issues.apache.org/jira/browse/HDDS-2463
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Affects Versions: 0.4.1
> Reporter: Xiaoyu Yao
> Assignee: Xiaoyu Yao
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> OzoneManagerProtocolClientSideTranslatorPB.java lines 766-772 contain multiple
> impl.getServiceInfo() calls, which can be reduced by adding a local variable.
> {code:java}
> resp.addAllServiceInfo(impl.getServiceInfo().getServiceInfoList().stream()
>     .map(ServiceInfo::getProtobuf)
>     .collect(Collectors.toList()));
> if (impl.getServiceInfo().getCaCertificate() != null) {
>   resp.setCaCertificate(impl.getServiceInfo().getCaCertificate()); {code}
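The refactoring pattern described above can be sketched as a runnable toy: an expensive accessor is invoked once and cached in a local variable instead of being called repeatedly. The stub below is not the Ozone code; the string result stands in for the real service-info object.

```java
public class GetServiceInfoOnce {
  static int calls = 0;

  // Stand-in for impl.getServiceInfo(), which may be expensive (it can
  // involve locking or remote state in the real server).
  static String getServiceInfo() {
    calls++;
    return "service-info";
  }

  static String buildResponse() {
    String serviceInfo = getServiceInfo();        // single call, cached locally
    String list = serviceInfo + ":list";          // was: getServiceInfo().getServiceInfoList()
    String cert = serviceInfo + ":caCertificate"; // was: getServiceInfo().getCaCertificate()
    return list + "|" + cert;
  }

  public static void main(String[] args) {
    buildResponse();
    System.out.println(calls); // prints 1, not 3
  }
}
```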
[jira] [Created] (HDDS-2468) scmcli close pipeline command not working
Nanda kumar created HDDS-2468:
Summary: scmcli close pipeline command not working
Key: HDDS-2468
URL: https://issues.apache.org/jira/browse/HDDS-2468
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: SCM
Reporter: Nanda kumar
Assignee: Nanda kumar

The close pipeline command is failing with the following exception:

{noformat}
java.lang.IllegalArgumentException: Unknown command type: ClosePipeline
	at org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:219)
	at org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
	at org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112)
	at org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:29883)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
{noformat}
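An "Unknown command type" IllegalArgumentException thrown from a server-side request dispatcher usually means the dispatch switch lacks a case for the new command, so it falls through to the default branch. A minimal illustration of that pattern, with the missing case added; the enum and handler names below are illustrative, not the actual SCM protocol code:

```java
public class DispatcherDemo {
  enum CmdType { GetContainer, ClosePipeline }

  static String dispatch(CmdType type) {
    switch (type) {
      case GetContainer:
        return "getContainer";
      case ClosePipeline:   // the case whose absence produces the error above
        return "closePipeline";
      default:
        throw new IllegalArgumentException("Unknown command type: " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(dispatch(CmdType.ClosePipeline)); // closePipeline
  }
}
```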
[jira] [Created] (HDFS-14987) EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
Souryakanta Dwivedy created HDFS-14987:
Summary: EC: EC file blockId location info displaying as "null" with hdfs fsck -blockId command
Key: HDFS-14987
URL: https://issues.apache.org/jira/browse/HDFS-14987
Project: Hadoop HDFS
Issue Type: Bug
Components: ec, tools
Affects Versions: 3.1.2
Reporter: Souryakanta Dwivedy
Attachments: EC_file_block_info.PNG, image-2019-11-13-18-34-00-067.png, image-2019-11-13-18-36-29-063.png, image-2019-11-13-18-38-18-899.png

EC file blockId location info is displayed as "null" with the hdfs fsck -blockId command.

* Check the blockId information of an EC-enabled file with "hdfs fsck -blockId": the blockId location info is displayed as null, which needs to be rectified.
!image-2019-11-13-18-31-28-319.png! === !image-2019-11-13-18-34-00-067.png!
* Check the output for a normal file block to compare:
!image-2019-11-13-18-36-29-063.png! === !image-2019-11-13-18-38-18-899.png!
* Actual output: null
* Expected output: it should display the blockId location info (nodes, racks) of the block, as specified in the usage info of the fsck -blockId option, e.g. "Block replica on datanode/rack: BLR1xx038/default-rack is HEALTHY".
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/ No changes -1 overall The following subsystems voted -1: asflicense findbugs hadolint pathlen unit xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml hadoop-tools/hadoop-azure/src/config/checkstyle-suppressions.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335] Failed junit tests : hadoop.util.TestReadWriteDiskValidator hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints hadoop.hdfs.TestDecommission hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints hadoop.registry.secure.TestSecureLogins hadoop.yarn.server.resourcemanager.TestLeaderElectorService hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 hadoop.yarn.client.api.impl.TestAMRMClient cc: 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K] cc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-compile-cc-root-jdk1.8.0_222.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-compile-javac-root-jdk1.8.0_222.txt [308K] checkstyle: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-checkstyle-root.txt [16M] hadolint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-patch-hadolint.txt [4.0K] pathlen: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/pathlen.txt [12K] pylint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-patch-pylint.txt [24K] shellcheck: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-patch-shellcheck.txt [72K] shelldocs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-patch-shelldocs.txt [8.0K] whitespace: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/whitespace-eol.txt [12M] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/whitespace-tabs.txt [1.3M] xml: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/xml.txt [12K] findbugs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-timelineservice-hbase_hadoop-yarn-server-timelineservice-hbase-client-warnings.html [8.0K] javadoc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-javadoc-javadoc-root-jdk1.7.0_95.txt [16K] 
https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/diff-javadoc-javadoc-root-jdk1.8.0_222.txt [1.1M] unit: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [160K] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [232K] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/504/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt [12K]
[jira] [Created] (HDFS-14986) ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
Ryan Wu created HDFS-14986:
Summary: ReplicaCachingGetSpaceUsed throws ConcurrentModificationException
Key: HDFS-14986
URL: https://issues.apache.org/jira/browse/HDFS-14986
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Ryan Wu
Assignee: Ryan Wu

Running DU across lots of disks is very expensive. We applied the patch from HDFS-14313 to get used space from ReplicaInfo in memory. However, the new du threads throw the following exception:

{code:java}
2019-11-08 18:07:13,858 ERROR [refreshUsed-/home/vipshop/hard_disk/7/dfs/dn/current/BP-1203969992-10.208.50.21-1450855658517] org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed: ReplicaCachingGetSpaceUsed refresh error
java.util.ConcurrentModificationException: Tree has been modified outside of iterator
	at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.checkForModification(FoldedTreeSet.java:311)
	at org.apache.hadoop.hdfs.util.FoldedTreeSet$TreeSetIterator.hasNext(FoldedTreeSet.java:256)
	at java.util.AbstractCollection.addAll(AbstractCollection.java:343)
	at java.util.HashSet.<init>(HashSet.java:120)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.deepCopyReplica(FsDatasetImpl.java:1052)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ReplicaCachingGetSpaceUsed.refresh(ReplicaCachingGetSpaceUsed.java:73)
	at org.apache.hadoop.fs.CachingGetSpaceUsed$RefreshThread.run(CachingGetSpaceUsed.java:178)
	at java.lang.Thread.run(Thread.java:748)
{code}
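The stack trace shows deepCopyReplica building a HashSet copy of a live replica set while a writer mutates it. A minimal illustration of that failure mode in plain Java (this stub is not the HDFS code; it just demonstrates why the copy must be taken under the dataset lock or over a snapshot):

```java
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Set;

public class CmeDemo {
  // Iterate a live set while structurally modifying it, the way a copy
  // racing with a writer would. The fail-fast iterator detects this.
  public static boolean copyWhileMutating() {
    Set<Integer> replicas = new HashSet<>();
    for (int i = 0; i < 100; i++) {
      replicas.add(i);
    }
    try {
      for (Integer r : replicas) {
        replicas.remove(r); // structural modification during iteration
      }
    } catch (ConcurrentModificationException e) {
      return true; // iterator detected the concurrent modification
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(copyWhileMutating()); // prints true
  }
}
```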
[jira] [Created] (HDFS-14985) FSCK for a block of EC Files doesnt display status at the end
Ravuri Sushma sree created HDFS-14985:
Summary: FSCK for a block of EC Files doesnt display status at the end
Key: HDFS-14985
URL: https://issues.apache.org/jira/browse/HDFS-14985
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Ravuri Sushma sree

FSCK of a blockId which belongs to an EC file does not print the status at the end; it displays null instead.

./hdfs fsck -blockId blk_-x
Connecting to namenode via
FSCK started by root (auth:SIMPLE) from /x.x.x.x at Wed Nov 13 19:37:02 CST 2019

Block Id: blk_-x
Block belongs to: /ecdir/f2
No. of Expected Replica: 3
No. of live Replica: 3
No. of excess Replica: 0
No. of stale Replica: 2
No. of decommissioned Replica: 0
No. of decommissioning Replica: 0
No. of corrupted Replica: 0
null
[jira] [Created] (HDDS-2467) Allow running Freon validators with limited memory
Attila Doroszlai created HDDS-2467:
Summary: Allow running Freon validators with limited memory
Key: HDDS-2467
URL: https://issues.apache.org/jira/browse/HDDS-2467
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: freon
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai

Freon validators read each item to be validated completely into a {{byte[]}} buffer. This allows timing only the read (and buffer allocation), but not the subsequent digest calculation. However, it also means that memory required for running the validators is proportional to key size.

I propose to add a command-line flag to allow calculating the digest while reading the input stream. This changes timing results a bit, since values will include the time required for digest calculation. On the other hand, Freon will be able to validate huge keys with limited memory.
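The streaming-digest idea above can be sketched with the JDK's DigestInputStream: the checksum is updated as bytes flow through a fixed-size buffer, so memory use is bounded by the buffer rather than the key size. This is a minimal sketch, not the Freon implementation, and MD5 here is just an example algorithm:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.DigestInputStream;
import java.security.MessageDigest;

public class StreamingDigest {
  // Compute the digest of a stream without buffering the whole value:
  // only the 8 KiB buffer is resident, regardless of stream length.
  public static byte[] digestWhileReading(InputStream in) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5");
    try (DigestInputStream dis = new DigestInputStream(in, md)) {
      byte[] buf = new byte[8192];
      while (dis.read(buf) != -1) {
        // digest is updated as a side effect of each read
      }
    }
    return md.digest();
  }

  public static void main(String[] args) throws Exception {
    byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
    byte[] streamed = digestWhileReading(new ByteArrayInputStream(data));
    byte[] allAtOnce = MessageDigest.getInstance("MD5").digest(data);
    System.out.println(java.util.Arrays.equals(streamed, allAtOnce)); // true
  }
}
```

As the issue notes, the measured read time then includes digest work, which is the trade-off for constant memory.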
[jira] [Created] (HDFS-14984) HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command
Souryakanta Dwivedy created HDFS-14984:
Summary: HDFS setQuota: Error message should be added for invalid input max range value to hdfs dfsadmin -setQuota command
Key: HDFS-14984
URL: https://issues.apache.org/jira/browse/HDFS-14984
Project: Hadoop HDFS
Issue Type: Improvement
Components: hdfs
Affects Versions: 3.1.2
Reporter: Souryakanta Dwivedy
Attachments: image-2019-11-13-14-05-19-603.png, image-2019-11-13-14-07-04-536.png

An error message should be added for the invalid input max range value "9223372036854775807" to the hdfs dfsadmin -setQuota command.

* Set quota for a directory with the invalid input value "9223372036854775807": the command succeeds without displaying any result, but the quota value is not actually set for the directory. From a usability point of view it would be better to display an error message for the invalid max range value "9223372036854775807", as is done when the input value is "0".
For example: "hdfs dfsadmin -setQuota 9223372036854775807 /quota"
!image-2019-11-13-14-05-19-603.png!
* Try to set quota for a directory with the invalid input value "0": it throws the error message "setQuota: Invalid values for quota : 0 and 9223372036854775807".
For example: "hdfs dfsadmin -setQuota 0 /quota"
!image-2019-11-13-14-07-04-536.png!
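Note that 9223372036854775807 is exactly Long.MAX_VALUE, which HDFS appears to treat as an internal sentinel, explaining the silent no-op. A hedged sketch of the proposed up-front check (illustrative only, not the actual dfsadmin code):

```java
public class QuotaValidation {
  // Reject both the existing invalid value (0 or below) and the sentinel
  // Long.MAX_VALUE, so the user gets explicit feedback in both cases.
  static void validateQuota(long quota) {
    if (quota <= 0 || quota == Long.MAX_VALUE) {
      throw new IllegalArgumentException(
          "setQuota: Invalid value for quota: " + quota);
    }
  }

  public static void main(String[] args) {
    validateQuota(100); // a normal quota passes
    try {
      validateQuota(Long.MAX_VALUE); // 9223372036854775807
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```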