[jira] [Created] (HDDS-2270) Avoid buffer copying in ContainerStateMachine.loadSnapshot/persistContainerSet
Tsz-wo Sze created HDDS-2270:

Summary: Avoid buffer copying in ContainerStateMachine.loadSnapshot/persistContainerSet
Key: HDDS-2270
URL: https://issues.apache.org/jira/browse/HDDS-2270
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Datanode
Reporter: Tsz-wo Sze
Assignee: Tsz-wo Sze

ContainerStateMachine:
- In loadSnapshot(..), it first reads the snapshotFile into a byte[] and then parses that into ContainerProtos.Container2BCSIDMapProto. The buffer copy can be avoided.
{code}
try (FileInputStream fin = new FileInputStream(snapshotFile)) {
  byte[] container2BCSIDData = IOUtils.toByteArray(fin);
  ContainerProtos.Container2BCSIDMapProto proto =
      ContainerProtos.Container2BCSIDMapProto
          .parseFrom(container2BCSIDData);
  ...
}
{code}
- persistContainerSet(..) has a similar problem.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
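The idea can be sketched in a self-contained way. The "copy" variant mirrors the current loadSnapshot(..): read the whole file into a byte[], then parse. The "streaming" variant hands the InputStream to the parser directly, which is what protobuf-java's generated parseFrom(InputStream) allows; the byte-summing parse() below is a stand-in for the real proto parsing, and the fix proper would call Container2BCSIDMapProto.parseFrom(fin).

```java
import java.io.*;
import java.nio.file.*;

public class SnapshotParseDemo {
  // Stand-in parser: sums the bytes read from a stream (hypothetical;
  // substitutes for Container2BCSIDMapProto.parseFrom).
  public static long parse(InputStream in) throws IOException {
    long sum = 0;
    int b;
    while ((b = in.read()) != -1) {
      sum += b;
    }
    return sum;
  }

  public static void main(String[] args) throws IOException {
    Path snapshot = Files.createTempFile("snapshot", ".bin");
    Files.write(snapshot, new byte[] {1, 2, 3, 4, 5});

    // Current approach: an extra buffer copy via byte[].
    byte[] copy = Files.readAllBytes(snapshot);
    long fromCopy = parse(new ByteArrayInputStream(copy));

    // Proposed approach: parse straight from the file stream, no copy.
    long fromStream;
    try (InputStream fin = Files.newInputStream(snapshot)) {
      fromStream = parse(fin);
    }

    // Both paths see the same data; only the allocation differs.
    System.out.println(fromCopy == fromStream);
    Files.delete(snapshot);
  }
}
```

The streaming variant avoids materializing the whole snapshot in memory twice (once in the file-read buffer, once in the byte[]), which matters most for large container maps.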
[jira] [Created] (HDDS-2269) Provide config for fair/non-fair for OM RW Lock
Bharat Viswanadham created HDDS-2269:

Summary: Provide config for fair/non-fair for OM RW Lock
Key: HDDS-2269
URL: https://issues.apache.org/jira/browse/HDDS-2269
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Bharat Viswanadham

Provide a config in the OzoneManager lock to choose between fair and non-fair mode for the OM RW lock.
[jira] [Created] (HDDS-2268) Incorrect container checksum upon downgrade
Attila Doroszlai created HDDS-2268:
--

Summary: Incorrect container checksum upon downgrade
Key: HDDS-2268
URL: https://issues.apache.org/jira/browse/HDDS-2268
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Datanode, upgrade
Reporter: Attila Doroszlai

The container file checksum is calculated over all YAML fields present in a given Ozone version. If the same container file is used by an older Ozone version, which has fewer fields, the expected checksum will differ. Example: origin pipeline ID and origin node ID were added for HDDS-837 in Ozone 0.4.0. Starting Ozone 0.3.0 with the same data results in a checksum error.

{noformat}
datanode_1 | ... ERROR ContainerReader:166 - Failed to parse ContainerFile for ContainerID: 1
datanode_1 | org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: Container checksum error for ContainerID: 1.
datanode_1 | Stored Checksum: 7a6ec508d6e3796c5fe5fd52574b3d3437b0a0eaa4e053f7a96a5e39f4abb374
datanode_1 | Expected Checksum: fee023a02d3ced2f7b0b42c116cce5f03da6b57b29965ca878dc46d1213230b6
datanode_1 | at org.apache.hadoop.ozone.container.common.helpers.ContainerUtils.verifyChecksum(ContainerUtils.java:259)
datanode_1 | at org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:165)
datanode_1 | at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerData(ContainerReader.java:180)
datanode_1 | at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:164)
datanode_1 | at org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:142)
{noformat}
[jira] [Resolved] (HDDS-2244) Use new ReadWrite lock in OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham resolved HDDS-2244.
--
Resolution: Fixed

> Use new ReadWrite lock in OzoneManager
> --
>
> Key: HDDS-2244
> URL: https://issues.apache.org/jira/browse/HDDS-2244
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Bharat Viswanadham
> Assignee: Bharat Viswanadham
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> Use new ReadWriteLock added in HDDS-2223.
[jira] [Resolved] (HDDS-2260) Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path (HDDS)
[ https://issues.apache.org/jira/browse/HDDS-2260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bharat Viswanadham resolved HDDS-2260.
--
Fix Version/s: 0.5.0
Resolution: Fixed

> Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path
> (HDDS)
> ---
>
> Key: HDDS-2260
> URL: https://issues.apache.org/jira/browse/HDDS-2260
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client, Ozone Datanode
> Affects Versions: 0.4.0
> Reporter: Mukul Kumar Singh
> Assignee: Siddharth Wagle
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> Arguments to LOG.trace and LOG.debug calls are evaluated even when
> debug/trace logging is disabled. This jira proposes to wrap all the
> trace/debug logging in LOG.isDebugEnabled and LOG.isTraceEnabled checks to
> avoid that cost.
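The cost the issue describes can be shown with a self-contained stand-in for the logger (no SLF4J dependency here; the debug() method and DEBUG_ENABLED flag are invented for illustration). The argument expression is evaluated before debug() is even called, unless the call sits behind an isDebugEnabled-style guard:

```java
public class GuardedLoggingDemo {
  static int evaluations = 0;

  // Simulates an expensive log-message construction.
  static String expensiveSummary() {
    evaluations++;
    return "state=...";
  }

  static final boolean DEBUG_ENABLED = false; // debug logging switched off

  static void debug(String msg) {
    if (DEBUG_ENABLED) {
      System.out.println(msg);
    }
  }

  public static void main(String[] args) {
    // Unguarded: expensiveSummary() runs even though nothing is logged.
    debug("container " + expensiveSummary());

    // Guarded: the argument is never evaluated while debug is disabled.
    if (DEBUG_ENABLED) {
      debug("container " + expensiveSummary());
    }

    // Only the unguarded call paid the construction cost.
    System.out.println(evaluations);
  }
}
```

With SLF4J's parameterized logging ({} placeholders) the guard is only needed when building the arguments themselves is expensive, which is exactly the read/write-path case the issue targets.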
[jira] [Created] (HDDS-2267) Container metadata scanner interval mismatch
Attila Doroszlai created HDDS-2267:
--

Summary: Container metadata scanner interval mismatch
Key: HDDS-2267
URL: https://issues.apache.org/jira/browse/HDDS-2267
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Datanode
Reporter: Attila Doroszlai
Assignee: Attila Doroszlai

The container metadata scanner can be configured to run at specific time intervals, e.g. hourly ({{hdds.containerscrub.metadata.scan.interval}}). However, the actual run interval does not match the configuration. After a datanode restart it runs in quick succession; later it runs at apparently random intervals.

{noformat:title=sample log}
datanode_1 | 2019-10-08 14:05:30 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 0, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2019-10-08 14:09:33 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
...
datanode_1 | 2019-10-08 14:09:33 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 28, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2019-10-08 14:21:01 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 29, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2019-10-08 14:21:01 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 30, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2019-10-08 15:30:38 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 31, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
datanode_1 | 2019-10-08 16:45:01 INFO ContainerMetadataScanner:88 - Completed an iteration of container metadata scrubber in 0 minutes. Number of iterations (since the data-node restart) : 32, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0
{noformat}

The problem is that the elapsed time is measured in nanoseconds, while the configured interval is in milliseconds.
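The unit mismatch described above can be sketched with java.util.concurrent.TimeUnit. Comparing a nanosecond elapsed value against a millisecond interval makes the elapsed time look about a million times larger than it is, so the scanner believes the interval has already passed; converting first fixes the comparison. Variable names here are illustrative, not taken from the actual patch:

```java
import java.util.concurrent.TimeUnit;

public class ScanIntervalDemo {
  public static void main(String[] args) {
    long intervalMillis = TimeUnit.HOURS.toMillis(1);  // configured: hourly
    long elapsedNanos = TimeUnit.MINUTES.toNanos(5);   // only 5 min since last scan

    // Buggy comparison: nanoseconds vs milliseconds --
    // 5 minutes appears to "exceed" an hour.
    boolean buggyShouldRun = elapsedNanos >= intervalMillis;

    // Fixed comparison: convert the elapsed nanoseconds to milliseconds first.
    long elapsedMillis = TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    boolean fixedShouldRun = elapsedMillis >= intervalMillis;

    System.out.println(buggyShouldRun + " " + fixedShouldRun); // prints "true false"
  }
}
```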
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1283/

[Oct 7, 2019 4:04:36 AM] (shashikant) HDDS-2169. Avoid buffer copies while submitting client requests in
[Oct 7, 2019 7:38:08 AM] (aajisaka) HADOOP-16512. [hadoop-tools] Fix order of actual and expected expression
[Oct 7, 2019 9:35:39 AM] (elek) HDDS-2252. Enable gdpr robot test in daily build
[Oct 7, 2019 12:07:46 PM] (stevel) HADOOP-16587. Make ABFS AAD endpoints configurable.
[Oct 7, 2019 5:17:25 PM] (bharat) HDDS-2239. Fix TestOzoneFsHAUrls (#1600)
[Oct 7, 2019 6:44:30 PM] (surendralilhore) HDFS-14373. EC : Decoding is failing when block group last incomplete
[Oct 7, 2019 8:59:49 PM] (aengineer) HDDS-2238. Container Data Scrubber spams log in empty cluster
[Oct 7, 2019 9:10:57 PM] (aengineer) HDDS-2264. Improve output of TestOzoneContainer
[Oct 7, 2019 9:30:23 PM] (aengineer) HDDS-2259. Container Data Scrubber computes wrong checksum
[Oct 7, 2019 9:38:54 PM] (aengineer) HDDS-2262. SLEEP_SECONDS: command not found
[Oct 7, 2019 10:41:42 PM] (aengineer) HDDS-2245. Use dynamic ports for SCM in TestSecureOzoneCluster

-1 overall

The following subsystems voted -1:
    asflicense compile findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML : Parsing Error(s):
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
       hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml

    FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-mawo/hadoop-yarn-applications-mawo-core
       Class org.apache.hadoop.applications.mawo.server.common.TaskStatus implements Cloneable but does not define or use clone method At TaskStatus.java:[lines 39-346]
       Equals method for org.apache.hadoop.applications.mawo.server.worker.WorkerId assumes the argument is of type WorkerId At WorkerId.java:[line 114]
       org.apache.hadoop.applications.mawo.server.worker.WorkerId.equals(Object) does not check for null argument At WorkerId.java:[lines 114-115]

    FindBugs : module:hadoop-cloud-storage-project/hadoop-cos
       Redundant nullcheck of dir, which is known to be non-null in org.apache.hadoop.fs.cosn.BufferPool.createDir(String) Redundant null check at BufferPool.java:[line 66]
       org.apache.hadoop.fs.cosn.CosNInputStream$ReadBuffer.getBuffer() may expose internal representation by returning CosNInputStream$ReadBuffer.buffer At CosNInputStream.java:[line 87]
       Found reliance on default encoding in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFile(String, File, byte[]): new String(byte[]) At CosNativeFileSystemStore.java:[line 199]
       Found reliance on default encoding in org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.storeFileWithRetry(String, InputStream, byte[], long): new String(byte[]) At CosNativeFileSystemStore.java:[line 178]
       org.apache.hadoop.fs.cosn.CosNativeFileSystemStore.uploadPart(File, String, String, int) may fail to clean up java.io.InputStream Obligation to clean up resource created at CosNativeFileSystemStore.java:[line 252] is not discharged

    FindBugs : module:hadoop-ozone/csi
       Useless control flow in csi.v1.Csi$CapacityRange$Builder.maybeForceBuilderInitialization() At Csi.java:[line 15977]
       Class csi.v1.Csi$ControllerExpandVolumeRequest defines non-transient non-serializable instance field secrets_ In Csi.java
       Useless control flow in csi.v1.Csi$ControllerExpandVolumeRequest$Builder.maybeForceBuilderInitialization() At Csi.java:[line
[jira] [Created] (HDFS-14902) NullPointer When Misconfigured
David Mollitor created HDFS-14902:
-

Summary: NullPointer When Misconfigured
Key: HDFS-14902
URL: https://issues.apache.org/jira/browse/HDFS-14902
Project: Hadoop HDFS
Issue Type: Improvement
Components: rbf
Affects Versions: 3.2.0
Reporter: David Mollitor

Admittedly the server was misconfigured, but this should be handled a bit more elegantly.

{code:none}
2019-10-08 11:19:52,505 ERROR router.NamenodeHeartbeatService: Unhandled exception updating NN registration for null:null
java.lang.NullPointerException
	at org.apache.hadoop.hdfs.federation.protocol.proto.HdfsServerFederationProtos$NamenodeMembershipRecordProto$Builder.setServiceAddress(HdfsServerFederationProtos.java:3831)
	at org.apache.hadoop.hdfs.server.federation.store.records.impl.pb.MembershipStatePBImpl.setServiceAddress(MembershipStatePBImpl.java:119)
	at org.apache.hadoop.hdfs.server.federation.store.records.MembershipState.newInstance(MembershipState.java:108)
	at org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver.registerNamenode(MembershipNamenodeResolver.java:259)
	at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.updateState(NamenodeHeartbeatService.java:223)
	at org.apache.hadoop.hdfs.server.federation.router.NamenodeHeartbeatService.periodicInvoke(NamenodeHeartbeatService.java:159)
	at org.apache.hadoop.hdfs.server.federation.router.PeriodicService$1.run(PeriodicService.java:178)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}
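The protobuf builder in the stack trace (setServiceAddress) throws NullPointerException when handed null, which happens here because the misconfigured registration resolves to null:null. One "more elegant" shape is a guard that fails with an actionable message before the builder is reached. This is a hypothetical sketch, not the actual RBF code; checkAddress and its message text are invented for illustration:

```java
public class RegistrationGuardDemo {
  // Hypothetical pre-check run before building the membership record.
  public static String checkAddress(String serviceAddress) {
    if (serviceAddress == null || serviceAddress.startsWith("null")) {
      throw new IllegalArgumentException(
          "Namenode service address is not configured for this nameservice; "
          + "check the namenode RPC/service address settings");
    }
    return serviceAddress;
  }

  public static void main(String[] args) {
    // Simulates the misconfigured registration from the report.
    try {
      checkAddress(null);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }

    // A well-formed address passes through unchanged.
    System.out.println(checkAddress("nn1.example.com:8020"));
  }
}
```

The point is only that a configuration error should surface as a clear message naming the missing setting, rather than as an NPE deep inside generated protobuf code.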
[jira] [Created] (HDFS-14901) RBF: Add Encryption Zone related ClientProtocol APIs
hemanthboyina created HDFS-14901:

Summary: RBF: Add Encryption Zone related ClientProtocol APIs
Key: HDFS-14901
URL: https://issues.apache.org/jira/browse/HDFS-14901
Project: Hadoop HDFS
Issue Type: Sub-task
Reporter: hemanthboyina
Assignee: hemanthboyina
Please cherry pick commits to lower branches
I spent the whole of last week cherry-picking commits from trunk/branch-3.2 to branch-3.1 (I should have done this prior to the 3.1.4 code freeze). There were about 50-60 of them; many were conflict-free, and several were critical bug fixes. If your commit stays only in trunk, it is useless to the community until the next minor release, and for many months after that while people migrate to the new release. Here are a few tips:

(1) A dependency update that addresses a known security vulnerability should be cherry-picked into all lower branches, especially when it only bumps the maintenance release number. Example: updating commons-compress from 1.18 to 1.19.
(2) Blocker/critical bug fixes should be backported to all applicable branches.
(3) Because of the removal of commons-logging and a few code refactors, commits may apply cleanly but not compile on branch-3.2, branch-3.1 and lower branches. Please spend the time to verify that a commit is good on each target branch.

Best
Weichiu
Re: [DISCUSS] Release Docs pointers Hadoop site
To be honest, I have no idea. I don't know the historical meaning. But as there is no other feedback, here are my guesses based on pure logic:

 * current -> should point to the release with the highest number (3.2.1)
 * stable -> the stable 3.x release with the highest number (3.2.1 as of now)
 * current2 -> the latest 2.x release
 * stable2 -> the latest stable 2.x release

>> 1. But if the release manager of 3.1 line thinks 3.1.3 is stable, and 3.2
>> line is also in stable state, which release should get precedence to be
>> called as *stable* in any release line (2.x or 3.x) ?

It depends on whether stable2 means (second highest stable) or (stable from the 2.x line). I think the second meaning is more reasonable.

>> 3.1.3 is getting released now, could
>> http://hadoop.apache.org/docs/current/ shall be updated to 3.1.3 ? is it
>> the norms ?

No, because "current" should point to the highest release, not to whichever release happened to ship most recently.

Marton

On 9/30/19 10:09 AM, Sunil Govindan wrote:
> Bumping up this thread again for feedback. @Zhankun Tang is now waiting
> for a confirmation to complete 3.1.3 release publish activities.
>
> - Sunil
>
> On Fri, Sep 27, 2019 at 11:03 AM Sunil Govindan wrote:
>
>> Hi Folks,
>>
>> At present,
>> http://hadoop.apache.org/docs/stable/ points to *Apache Hadoop 3.2.1*
>> http://hadoop.apache.org/docs/current/ points to *Apache Hadoop 3.2.1*
>> http://hadoop.apache.org/docs/stable2/ points to *Apache Hadoop 2.9.2*
>> http://hadoop.apache.org/docs/current2/ points to *Apache Hadoop 2.9.2*
>>
>> 3.2.1 was released yesterday. *Now 3.1.3 has completed voting* and it is
>> in the final stages of staging.
>>
>> As per me,
>> a) 3.2.1 will still be pointing to http://hadoop.apache.org/docs/stable/ ?
>> b) 3.1.3 should be pointing to http://hadoop.apache.org/docs/current/ ?
>>
>> Now my questions,
>> 1. But if the release manager of 3.1 line thinks 3.1.3 is stable, and 3.2
>> line is also in stable state, which release should get precedence to be
>> called as *stable* in any release line (2.x or 3.x) ? or do we need a vote
>> or discuss thread to decide which release shall be called as stable per
>> release line?
>> 2. Given 3.2.1 is released and pointing to 3.2.1 as stable, then when
>> 3.1.3 is getting released now, could http://hadoop.apache.org/docs/current/
>> shall be updated to 3.1.3 ? is it the norms ?
>>
>> Thanks
>> Sunil