[jira] [Created] (HDDS-286) Fix NodeReportPublisher.getReport NPE
Xiaoyu Yao created HDDS-286:
---
Summary: Fix NodeReportPublisher.getReport NPE
Key: HDDS-286
URL: https://issues.apache.org/jira/browse/HDDS-286
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Xiaoyu Yao

This can be reproduced with TestKeys#testPutKey:

{code}
2018-07-23 21:33:55,598 WARN concurrent.ExecutorHelper (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in thread Datanode ReportManager Thread - 0:
java.lang.NullPointerException
	at org.apache.hadoop.ozone.container.common.volume.VolumeInfo.getScmUsed(VolumeInfo.java:107)
	at org.apache.hadoop.ozone.container.common.volume.VolumeSet.getNodeReport(VolumeSet.java:350)
	at org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.getNodeReport(OzoneContainer.java:260)
	at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:64)
	at org.apache.hadoop.ozone.container.common.report.NodeReportPublisher.getReport(NodeReportPublisher.java:39)
	at org.apache.hadoop.ozone.container.common.report.ReportPublisher.publishReport(ReportPublisher.java:86)
	at org.apache.hadoop.ozone.container.common.report.ReportPublisher.run(ReportPublisher.java:73)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
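The trace suggests the usage object read by VolumeInfo.getScmUsed can be null when the report thread races volume shutdown. A minimal null-guard sketch of the idea; the class, field, and zero-return convention here are assumptions for illustration, not the actual VolumeInfo code:

```java
// Hypothetical sketch of a null-guard for the getScmUsed NPE.
// 'usage' stands in for whatever usage helper is null at VolumeInfo.java:107.
class VolumeInfoSketch {
    private volatile Long usage;

    VolumeInfoSketch(Long usage) {
        this.usage = usage;
    }

    long getScmUsed() {
        Long u = usage;  // read once so a concurrent reset cannot bite twice
        if (u == null) {
            return 0L;   // report zero instead of throwing NullPointerException
        }
        return u;
    }
}
```

Whether to report zero or skip the volume entirely is a design choice the eventual patch would have to make.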
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/846/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    Failed CTEST tests:
        test_test_libhdfs_threaded_hdfs_static
        test_libhdfs_threaded_hdfspp_test_shim_static

    Failed junit tests:
        hadoop.util.TestDiskCheckerWithDiskIo
        hadoop.util.TestBasicDiskValidator
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
        hadoop.hdfs.qjournal.server.TestJournalNodeSync
        hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher
        hadoop.mapred.TestMRTimelineEventHandling

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-compile-cc-root.txt [4.0K]

    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-compile-javac-root.txt [332K]

    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-checkstyle-root.txt [4.0K]

    pathlen:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/pathlen.txt [12K]

    pylint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-patch-pylint.txt [24K]

    shellcheck:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-patch-shellcheck.txt [20K]

    shelldocs:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-patch-shelldocs.txt [16K]

    whitespace:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/whitespace-eol.txt [9.4M]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/whitespace-tabs.txt [1.1M]

    xml:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/xml.txt [4.0K]

    findbugs:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-hdds_client.txt [56K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt [52K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-hdds_framework.txt [12K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt [56K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-hdds_tools.txt [16K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_client.txt [4.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_common.txt [28K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt [4.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt [4.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_ozonefs.txt [8.0K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/branch-findbugs-hadoop-ozone_tools.txt [4.0K]

    javadoc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/diff-javadoc-javadoc-root.txt [760K]

    CTEST:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/patch-hadoop-hdfs-project_hadoop-hdfs-native-client-ctest.txt [116K]

    unit:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt [192K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt [336K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-native-client.txt [112K]
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/843/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt [80K]
[jira] [Created] (HDDS-285) Create a generic Metadata Iterator
Bharat Viswanadham created HDDS-285:
---
Summary: Create a generic Metadata Iterator
Key: HDDS-285
URL: https://issues.apache.org/jira/browse/HDDS-285
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham

This Jira tracks the work to add a wrapper Iterator class and to use that iterator when iterating over the DB.
[jira] [Created] (HDDS-284) Interleaving CRC for ChunksData
Bharat Viswanadham created HDDS-284:
---
Summary: Interleaving CRC for ChunksData
Key: HDDS-284
URL: https://issues.apache.org/jira/browse/HDDS-284
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Reporter: Bharat Viswanadham
Assignee: Shashikant Banerjee

This Jira is to add CRC for chunks data. Right now, in chunkInfo, the data is just a byte array. We want to change this as below (note the original sketch assigned field number 6 twice; data is renumbered to 7 here):

{code}
message Data {
  required string magic = 1 [default = "Tullys00"];
  required CRCTYPE crcType = 2;
  optional string LegacyMetadata = 3;    // Fields to support in-place data migration
  optional string LegacyData = 4;
  optional uint32 ChecksumblockSize = 5; // Size of the block used to compute the checksums
  repeated uint32 checksums = 6;         // Set of checksums
  repeated bytes data = 7;               // Actual data stream
}
{code}

This will help with error detection for containers during the container scanner run.
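The interleaved-checksum idea above can be sketched as follows: split the chunk data into fixed-size checksum blocks and keep one CRC per block, so a scanner can verify each block independently. This is a hedged illustration using java.util.zip.CRC32; the class name, method, and block size are not the actual HDDS API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Illustrative sketch: one CRC32 per checksum block of the chunk data.
class ChunkChecksums {
    static List<Long> crcPerBlock(byte[] data, int blockSize) {
        List<Long> checksums = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);     // checksum only this block
            checksums.add(crc.getValue());
        }
        return checksums;
    }
}
```

A corrupted region then invalidates only the CRCs of the blocks it touches, which is what lets the container scanner localize errors.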
[jira] [Created] (HDFS-13761) Add toString Method to AclFeature Class
Shweta created HDFS-13761:
-
Summary: Add toString Method to AclFeature Class
Key: HDFS-13761
URL: https://issues.apache.org/jira/browse/HDFS-13761
Project: Hadoop HDFS
Issue Type: Improvement
Reporter: Shweta
Assignee: Shweta
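A minimal sketch of what such a toString could look like; the field layout here is illustrative only (the real AclFeature stores entries in a packed internal representation), so treat names and output format as assumptions:

```java
import java.util.Arrays;

// Hypothetical stand-in for AclFeature, just to illustrate a toString.
class AclFeatureSketch {
    private final int[] entries;

    AclFeatureSketch(int[] entries) {
        this.entries = entries;
    }

    @Override
    public String toString() {
        // Include the entries so log lines and debugger views are readable.
        return "AclFeature{entries=" + Arrays.toString(entries) + "}";
    }
}
```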
Apache Hadoop qbt Report: trunk+JDK8 on Windows/x64
For more details, see https://builds.apache.org/job/hadoop-trunk-win/536/

[Jul 23, 2018 3:43:03 AM] (msingh) HDDS-181. CloseContainer should commit all pending open Keys on a

[Error replacing 'FILE' - Workspace is not accessible]
[jira] [Created] (HDFS-13760) improve ZKFC fencing action when network of ZKFC interrupt
He Xiaoqiao created HDFS-13760:
--
Summary: improve ZKFC fencing action when network of ZKFC interrupt
Key: HDFS-13760
URL: https://issues.apache.org/jira/browse/HDFS-13760
Project: Hadoop HDFS
Issue Type: Improvement
Components: ha
Reporter: He Xiaoqiao

When the host of the Active NameNode and its ZKFC hits a network fault for an extended time, HDFS becomes unavailable, because the ZKFC located on the Standby NameNode can never fence successfully: it cannot SSH to the Active NameNode. In this situation a client cannot connect to the Active NameNode, and after failing over to the Standby it still gets no READ/WRITE service.

{code:xml}
2018-07-23 15:57:10,836 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 40 time(s); maxRetries=45
2018-07-23 15:57:30,856 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 41 time(s); maxRetries=45
2018-07-23 15:57:50,872 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 42 time(s); maxRetries=45
2018-07-23 15:58:10,892 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 43 time(s); maxRetries=45
2018-07-23 15:58:30,912 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: rz-data-hdp-nn14.rz.sankuai.com/10.16.70.34:8060. Already tried 44 time(s); maxRetries=45
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ZKFailoverController: get old active state exception: org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending local=/ip:port remote=hostname]
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: old active is not healthy. need to create znode
2018-07-23 15:58:50,933 INFO org.apache.hadoop.ha.ActiveStandbyElector: Elector callbacks for NameNode at standbynn start create node, now time: 45179010079342817
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: CreateNode result: 0 code:OK for path: /hadoop-ha/ns/ActiveStandbyElectorLock connectionState: CONNECTED for elector id=469098346 appData=0a07727a2d6e6e313312046e6e31331a1f727a2d646174612d6864702d6e6e31332e727a2e73616e6b7561692e636f6d20e83e28d33e cb=Elector callbacks for NameNode at standbynamenode
2018-07-23 15:58:50,936 INFO org.apache.hadoop.ha.ActiveStandbyElector: Checking for any old active which needs to be fenced...
2018-07-23 15:58:50,938 INFO org.apache.hadoop.ha.ActiveStandbyElector: Old node exists: 0a07727a2d6e6e313312046e6e31341a1f727a2d646174612d6864702d6e6e31342e727a2e73616e6b7561692e636f6d20e83e28d33e
2018-07-23 15:58:50,939 INFO org.apache.hadoop.ha.ZKFailoverController: Should fence: NameNode at activenamenode
2018-07-23 15:59:10,960 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: activenamenode. Already tried 0 time(s); maxRetries=1
2018-07-23 15:59:30,980 WARN org.apache.hadoop.ha.FailoverController: Unable to gracefully make NameNode at activenamenode standby (unable to connect)
org.apache.hadoop.net.ConnectTimeoutException: Call From standbynamenode to activenamenode failed on socket timeout exception: org.apache.hadoop.net.ConnectTimeoutException: 2 millis timeout while waiting for channel to be ready for connect. ch : java.nio.channels.SocketChannel[connection-pending local=ip:port remote=activenamenode]; For more details see: http://wiki.apache.org/hadoop/SocketTimeout
{code}

I propose that when the Active NameNode hits a network fault, its local ZKFC forces that NameNode to become Standby, so that the other ZKFC can hold the ZNode for election and transition its NameNode to Active even when SSH fencing fails. There is no patch available yet, and I would very much welcome suggestions.
[jira] [Created] (HDDS-283) Need an option to list all volumes created in the cluster
Nilotpal Nandi created HDDS-283:
---
Summary: Need an option to list all volumes created in the cluster
Key: HDDS-283
URL: https://issues.apache.org/jira/browse/HDDS-283
Project: Hadoop Distributed Data Store
Issue Type: Bug
Components: Ozone Client
Reporter: Nilotpal Nandi
Fix For: 0.2.1

Currently, the listVolume command returns either:
1) all the volumes created by a particular user, when the -user argument is given, or
2) all the volumes created by the logged-in user, when no -user argument is provided.

We need an option to list all the volumes created in the cluster.
[jira] [Created] (HDDS-282) Consolidate logging in scm/container-service
Elek, Marton created HDDS-282:
-
Summary: Consolidate logging in scm/container-service
Key: HDDS-282
URL: https://issues.apache.org/jira/browse/HDDS-282
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton
Fix For: 0.2.1

During real cluster tests I found some of the logging/error handling very annoying. I propose to improve the following behaviour:

# In case of a datanode -> scm communication failure we don't log the exception there (EndpointStateMachine.java:L206). As the messages have already been throttled, I think it's safe to log the exception.
# In BlockDeletingService:L123, I would log the message ("Plan to choose {} containers for block deletion, actually returns {} valid containers") only if the number of valid containers is greater than 0.
# EventQueue could log a warning if a handler is missing for a message (instead of throwing an exception).
# TypedEvent should have a toString method (as it's used in the EventQueue logging).
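For item 4, a toString along these lines would make EventQueue log output show the event type instead of an object hash. The fields below are illustrative assumptions about TypedEvent's shape, not the actual class:

```java
// Hypothetical stand-in for TypedEvent, to illustrate the proposed toString.
class TypedEventSketch<P> {
    private final Class<P> payloadType;
    private final String name;

    TypedEventSketch(Class<P> payloadType, String name) {
        this.payloadType = payloadType;
        this.name = name;
    }

    @Override
    public String toString() {
        // Log-friendly form: event name plus its payload type.
        return "TypedEvent{payloadType=" + payloadType.getSimpleName()
            + ", name='" + name + "'}";
    }
}
```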
[jira] [Created] (HDDS-281) Need container size distribution metric in OzoneManager UI
Nilotpal Nandi created HDDS-281:
---
Summary: Need container size distribution metric in OzoneManager UI
Key: HDDS-281
URL: https://issues.apache.org/jira/browse/HDDS-281
Project: Hadoop Distributed Data Store
Issue Type: Improvement
Components: Ozone Manager
Reporter: Nilotpal Nandi

It would be good to have a metric/histogram in the OzoneManager UI showing the different container size ranges and the percentage of containers in the cluster that fall into each range. For example:

0-2 GB : 10%
2-4 GB : 20%
4-5 GB : 70%
5+ GB  : 0%
[jira] [Created] (HDDS-280) Support ozone dist-start-stitching on openbsd/osx
Elek, Marton created HDDS-280:
-
Summary: Support ozone dist-start-stitching on openbsd/osx
Key: HDDS-280
URL: https://issues.apache.org/jira/browse/HDDS-280
Project: Hadoop Distributed Data Store
Issue Type: Bug
Reporter: Elek, Marton

{quote}Ozone is creating a symlink during the dist process. Using the "ozone" directory as a destination name, all the docker-based acceptance tests and docker-compose files are simpler, as they don't need to have the version information in the path. But to keep the version-specific folder name in the tar file we create a symbolic link during the tar creation. With the symbolic link and the '--dereference' tar argument we can create a tar file which includes a versioned directory (ozone-0.2.1) while using a dist directory without the version in the name (hadoop-dist/target/ozone).{quote}

This is the description of the current dev-support/bin/ozone-dist-tar-stitching. [~aw], in a comment on HDDS-276, pointed to the problem that some BSD variants don't support the dereference command line option of the ln command.

The main reason to use this approach is to get a simplified destination name without the version (hadoop-dist/target/ozone instead of hadoop-dist/target/ozone-0.2.1). It simplifies the docker-compose based environments and acceptance tests, therefore I prefer to keep the simplified destination name. The issue is only the tar file creation, and only if we need the version number in the name of the root directory inside the tar. Possible solutions:

# Use cp target/ozone target/ozone-0.2.1 + tar. It's simple but slower and requires more space.
# Do the tar distribution from docker whenever 'dereference' is not supported. Not very convenient.
# Accept that the tar will contain an ozone directory and not ozone-0.2.1. This is the simplest option and can be improved with an additional VERSION file in the root of the distribution.
# (+1) Use hadoop-dist/target/ozone-0.2.1 instead of hadoop-dist/target/ozone. This is more complex for the docker-based testing as we need the explicit names in the compose files (volume: ../../../hadoop-dist/target/ozone-0.2.1), and the structure is more complex with the version in the directory name.

Please comment with your preference.