[jira] [Created] (YARN-8811) Support Container Storage Interface (CSI) in YARN
Weiwei Yang created YARN-8811:
---------------------------------

             Summary: Support Container Storage Interface (CSI) in YARN
                 Key: YARN-8811
                 URL: https://issues.apache.org/jira/browse/YARN-8811
             Project: Hadoop YARN
          Issue Type: New Feature
            Reporter: Weiwei Yang

The Container Storage Interface (CSI) is a vendor-neutral interface that bridges Container Orchestrators and Storage Providers. Adopting CSI in YARN will make it easier to integrate third-party storage systems and will provide the ability to attach persistent volumes for stateful applications.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/903/

No changes

-1 overall

The following subsystems voted -1:
    asflicense findbugs pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

        Parsing Error(s):
            hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

        module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
        Unread field: FSBasedSubmarineStorageImpl.java:[line 39]
        Found reliance on default encoding in org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.generateCommandLaunchScript(RunJobParameters, TaskType, Component): new java.io.FileWriter(File) At YarnServiceJobSubmitter.java:[line 192]
        org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.generateCommandLaunchScript(RunJobParameters, TaskType, Component) may fail to clean up java.io.Writer on checked exception; obligation to clean up resource created at YarnServiceJobSubmitter.java:[line 192] is not discharged
        org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils.getComponentArrayJson(String, int, String) concatenates strings using + in a loop At YarnServiceUtils.java:[line 72]

    Failed CTEST tests :
        test_test_libhdfs_threaded_hdfs_static
        test_libhdfs_threaded_hdfspp_test_shim_static

    Failed junit tests :
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.yarn.server.resourcemanager.metrics.TestSystemMetricsPublisher

   cc: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-compile-cc-root.txt [4.0K]
   javac: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-compile-javac-root.txt [304K]
   checkstyle: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-checkstyle-root.txt [17M]
   pathlen: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/pathlen.txt [12K]
   pylint: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-patch-pylint.txt [24K]
   shellcheck: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-patch-shellcheck.txt [20K]
   shelldocs: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/diff-patch-shelldocs.txt [16K]
   whitespace: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/whitespace-eol.txt [9.4M]
               https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/whitespace-tabs.txt [1.1M]
   xml: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/xml.txt [4.0K]
   findbugs: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-applications_hadoop-yarn-submarine-warnings.html [12K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-hdds_client.txt [4.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt [8.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-hdds_framework.txt [4.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt [12K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-hdds_tools.txt [4.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-ozone_client.txt [8.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-ozone_common.txt [4.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt [8.0K]
             https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/890/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt [4.0K]
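For context on the two YarnServiceJobSubmitter warnings above: FindBugs is flagging {{new java.io.FileWriter(File)}}, which uses the platform default encoding and, without try-with-resources, may leak the Writer when a checked exception is thrown. A minimal sketch of a write pattern that would satisfy both warnings; the class and method names are illustrative, not the actual Submarine patch:

{code:java}
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LaunchScriptWriterSketch {

  // An explicit charset avoids the "reliance on default encoding" warning;
  // try-with-resources closes the Writer even on a checked exception,
  // discharging the cleanup obligation FindBugs reports.
  static void writeLaunchScript(Path script, String content) throws IOException {
    try (BufferedWriter writer =
             Files.newBufferedWriter(script, StandardCharsets.UTF_8)) {
      writer.write(content);
    }
  }

  public static void main(String[] args) throws IOException {
    writeLaunchScript(Paths.get("/tmp/demo-launch-script.sh"),
        "#!/bin/bash\necho hello\n");
  }
}
{code}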
[jira] [Created] (YARN-8810) Yarn Service: discrepancy between hashcode and equals of ConfigFile
Chandni Singh created YARN-8810:
-----------------------------------

             Summary: Yarn Service: discrepancy between hashcode and equals of ConfigFile
                 Key: YARN-8810
                 URL: https://issues.apache.org/jira/browse/YARN-8810
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Chandni Singh
            Assignee: Chandni Singh

The {{ConfigFile}} class's {{equals}} method doesn't check the equality of {{properties}}, but its {{hashCode}} does include {{properties}}. This breaks the equals/hashCode contract: two instances that differ only in {{properties}} compare equal yet produce different hash codes.
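For reference, the contract requires that any field folded into {{hashCode}} is also compared in {{equals}}. A minimal sketch of a consistent pair, assuming a simplified ConfigFile with only a source file and a properties map (the real class has more fields and uses builder utilities):

{code:java}
import java.util.Map;
import java.util.Objects;

// Simplified stand-in for the YARN Service ConfigFile (illustrative only).
public class ConfigFileSketch {
  private String srcFile;
  private Map<String, String> properties;

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof ConfigFileSketch)) {
      return false;
    }
    ConfigFileSketch that = (ConfigFileSketch) o;
    // equals compares every field that hashCode uses, including properties,
    // so that equal objects always share a hash code.
    return Objects.equals(srcFile, that.srcFile)
        && Objects.equals(properties, that.properties);
  }

  @Override
  public int hashCode() {
    return Objects.hash(srcFile, properties);
  }
}
{code}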
[jira] [Created] (YARN-8809) Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
Haibo Chen created YARN-8809:
--------------------------------

             Summary: Fair Scheduler does not decrement queue metrics when OPPORTUNISTIC containers are released.
                 Key: YARN-8809
                 URL: https://issues.apache.org/jira/browse/YARN-8809
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: fairscheduler
    Affects Versions: YARN-1011
            Reporter: Haibo Chen
            Assignee: Haibo Chen
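The JIRA carries no description beyond the summary, but the bug class it names is an asymmetric metrics update: allocation counts OPPORTUNISTIC containers, while the release path only decrements for GUARANTEED ones. A hypothetical sketch of the symmetry the fix needs (not FairScheduler code; all names are invented):

{code:java}
enum ExecType { GUARANTEED, OPPORTUNISTIC }

class QueueMetricsSketch {
  private long allocatedMB;

  void onContainerAllocated(ExecType type, long mb) {
    allocatedMB += mb;   // counted for both execution types
  }

  void onContainerReleased(ExecType type, long mb) {
    // The buggy variant skips this when type == OPPORTUNISTIC; the release
    // path must mirror the allocation path for both execution types.
    allocatedMB -= mb;
  }

  long getAllocatedMB() {
    return allocatedMB;
  }
}
{code}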
[jira] [Created] (YARN-8808) Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
Haibo Chen created YARN-8808:
--------------------------------

             Summary: Use aggregate container utilization instead of node utilization to determine resources available for oversubscription
                 Key: YARN-8808
                 URL: https://issues.apache.org/jira/browse/YARN-8808
             Project: Hadoop YARN
          Issue Type: Sub-task
    Affects Versions: YARN-1011
            Reporter: Haibo Chen
            Assignee: Haibo Chen
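The summary alone describes the intended change. A hedged sketch of the arithmetic, under the assumption (mine, not stated in the JIRA) that container-based accounting is preferred because node utilization also counts processes outside YARN's control; the method name is invented for illustration:

{code:java}
class OversubscriptionSketch {

  static long availableForOversubscriptionMB(long nodeCapacityMB,
      long aggregateContainerUsedMB) {
    return Math.max(0, nodeCapacityMB - aggregateContainerUsedMB);
  }

  public static void main(String[] args) {
    // Node capacity 10240 MB; containers use 4096 MB, while a non-YARN
    // daemon pushes total node utilization to 8192 MB. Container-based
    // accounting reports 6144 MB available instead of 2048 MB.
    System.out.println(availableForOversubscriptionMB(10240, 4096));
  }
}
{code}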
[jira] [Created] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
Haibo Chen created YARN-8807:
--------------------------------

             Summary: FairScheduler crashes RM with oversubscription turned on if an application is killed.
                 Key: YARN-8807
                 URL: https://issues.apache.org/jira/browse/YARN-8807
             Project: Hadoop YARN
          Issue Type: Sub-task
          Components: fairscheduler, resourcemanager
    Affects Versions: YARN-1011
            Reporter: Haibo Chen
            Assignee: Haibo Chen
[jira] [Created] (YARN-8806) Enable local staging directory and clean it up when submarine job is submitted
Zac Zhou created YARN-8806:
------------------------------

             Summary: Enable local staging directory and clean it up when submarine job is submitted
                 Key: YARN-8806
                 URL: https://issues.apache.org/jira/browse/YARN-8806
             Project: Hadoop YARN
          Issue Type: Sub-task
         Environment: In the /tmp dir, there are launch scripts which are not cleaned up, as follows:
-rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 PRIMARY_WORKER-launch-script8635233314077649086.sh
-rw-r--r-- 1 hadoop netease 1100 Sep 18 10:46 WORKER-launch-script129488020578466938.sh
-rw-r--r-- 1 hadoop netease 1028 Sep 18 10:46 PS-launch-script471092031021738136.sh
            Reporter: Zac Zhou
            Assignee: Zac Zhou

YarnServiceJobSubmitter.generateCommandLaunchScript creates container launch scripts in the local filesystem. The launch scripts are uploaded to the HDFS staging dir, but the local copies are not deleted after the job is submitted.
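A minimal sketch of the cleanup shape this asks for, assuming a write-upload-delete flow (illustrative only, not the Submarine patch; the upload helper is a placeholder):

{code:java}
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Illustrative sketch: keep the locally generated launch script only as
// long as the upload needs it, then remove it.
public class StagingCleanupSketch {

  static void submitLaunchScript(String component, String content)
      throws IOException {
    File script = File.createTempFile(component + "-launch-script", ".sh");
    try {
      Files.write(script.toPath(), content.getBytes(StandardCharsets.UTF_8));
      uploadToStagingDir(script);   // e.g. FileSystem#copyFromLocalFile
    } finally {
      if (!script.delete()) {
        script.deleteOnExit();      // fallback if the immediate delete fails
      }
    }
  }

  private static void uploadToStagingDir(File script) {
    // HDFS upload omitted; placeholder for the staging-dir copy.
  }
}
{code}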
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/

[Sep 19, 2018 3:05:25 AM] (xyao) HDDS-488. Handle chill mode exception from SCM in OzoneManager.
[Sep 19, 2018 3:23:50 AM] (xiao) HDFS-13833. Improve BlockPlacementPolicyDefault's consider load logic.
[Sep 19, 2018 4:18:15 AM] (arp) HDDS-497. Suppress license warnings for error log files. Contributed by
[Sep 19, 2018 6:21:29 AM] (xyao) HDDS-506. Fields in AllocateScmBlockResponseProto should be optional.
[Sep 19, 2018 10:12:20 AM] (weichiu) HDFS-13868. WebHDFS: GETSNAPSHOTDIFF API NPE when param "snapshotname"
[Sep 19, 2018 11:31:07 AM] (wwei) YARN-8771. CapacityScheduler fails to unreserve when cluster resource
[Sep 19, 2018 1:22:08 PM] (nanda) HDDS-476. Add Pipeline reports to make pipeline active on SCM restart.
[Sep 19, 2018 2:16:25 PM] (nanda) HDDS-502. Exception in OM startup when running unit tests. Contributed
[Sep 19, 2018 3:11:05 PM] (elek) HDDS-458. numberofKeys is 0 for all containers even when keys are
[Sep 19, 2018 4:13:44 PM] (nanda) HDDS-461. Container remains in CLOSING state in SCM forever. Contributed
[Sep 19, 2018 4:30:25 PM] (inigoiri) HDFS-13908. TestDataNodeMultipleRegistrations is flaky. Contributed by
[Sep 19, 2018 4:48:55 PM] (sunilg) YARN-8757. [Submarine] Add Tensorboard component when --tensorboard is
[Sep 19, 2018 5:16:11 PM] (eyang) YARN-8791. Trim docker inspect output for line feed for STOPSIGNAL
[Sep 19, 2018 6:21:50 PM] (nanda) HDDS-460. Replication manager failed to import container data.
[Sep 19, 2018 7:52:18 PM] (nanda) HDDS-507. EventQueue should be shutdown on SCM shutdown. Contributed by
[Sep 19, 2018 8:00:30 PM] (inigoiri) HADOOP-15684. triggerActiveLogRoll stuck on dead name node, when
[Sep 19, 2018 8:03:19 PM] (aengineer) HDDS-509. TestStorageContainerManager is flaky. Contributed by Xiaoyu
[Sep 19, 2018 8:22:37 PM] (cliang) HADOOP-15726. Create utility to limit frequency of log statements.
[Sep 19, 2018 8:48:27 PM] (arp) HADOOP-15772. Remove the 'Path ... should be specified as a URI'
[Sep 19, 2018 9:35:29 PM] (bharat) HDDS-513. Check if the EventQueue is not closed before executing
[Sep 19, 2018 9:44:51 PM] (jlowe) YARN-8784. DockerLinuxContainerRuntime prevents access to distributed

-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running:
(runtime bigger than 1h 0m 0s)
    unit

Specific tests:

    XML :

        Parsing Error(s):
            hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

        module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-submarine
        Found reliance on default encoding in org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceJobSubmitter.generateCommandLaunchScript(RunJobParameters, TaskType, Component): new java.io.FileWriter(File) At YarnServiceJobSubmitter.java:[line 208]
        org.apache.hadoop.yarn.submarine.runtimes.yarnservice.YarnServiceUtils.getComponentArrayJson(String, int, String) concatenates strings using + in a loop At YarnServiceUtils.java:[line 92]

    Failed CTEST tests :
        test_test_libhdfs_threaded_hdfs_static
        test_libhdfs_threaded_hdfspp_test_shim_static

    Failed junit tests :
        hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.yarn.client.api.impl.TestAMRMProxy

   cc: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-compile-cc-root.txt [4.0K]
   javac: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-compile-javac-root.txt [300K]
   checkstyle: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-checkstyle-root.txt [17M]
   hadolint: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-patch-hadolint.txt [4.0K]
   pathlen: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/pathlen.txt [12K]
   pylint: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-patch-pylint.txt [24K]
   shellcheck: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-patch-shellcheck.txt [20K]
   shelldocs: https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/902/artifact/out/diff-patch-shelldocs.txt [16K]
   whitespace:
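The remaining YarnServiceUtils warning above is the classic "+ in a loop" pattern, where each concatenation allocates a fresh intermediate String. An illustrative fix using StringBuilder; the class, method name, and JSON shape are assumptions, not the actual getComponentArrayJson signature:

{code:java}
public class JsonArraySketch {

  static String componentArrayJson(String componentName, int count) {
    StringBuilder json = new StringBuilder("[");
    for (int i = 0; i < count; i++) {
      if (i > 0) {
        json.append(',');
      }
      json.append('"').append(componentName).append('-').append(i).append('"');
    }
    return json.append(']').toString();
  }

  public static void main(String[] args) {
    // Prints: ["worker-0","worker-1","worker-2"]
    System.out.println(componentArrayJson("worker", 3));
  }
}
{code}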
[jira] [Created] (YARN-8805) Automatically convert the launch command to the exec form when using entrypoint support
Shane Kumpf created YARN-8805:
---------------------------------

             Summary: Automatically convert the launch command to the exec form when using entrypoint support
                 Key: YARN-8805
                 URL: https://issues.apache.org/jira/browse/YARN-8805
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Shane Kumpf

When {{YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE}} is true and a launch command is provided, the user is expected to supply the launch command in exec form. For example:
{code:java}
"/usr/bin/sleep 6000"
{code}
must be changed to:
{code}
"/usr/bin/sleep,6000"
{code}
If this is not done, the container never starts and remains in the Created state. We should do this conversion automatically rather than making the user understand this nuance of the entrypoint support. Docs should be updated to reflect this change.
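A minimal sketch of the proposed conversion (illustrative, not the eventual YARN patch): split the space-delimited command and rejoin with commas. A real implementation would also need to honor quoted arguments that contain spaces, which this sketch ignores:

{code:java}
import java.util.regex.Pattern;

public class ExecFormSketch {
  private static final Pattern WHITESPACE = Pattern.compile("\\s+");

  // Turn a space-delimited launch command into the comma-delimited
  // exec form the entrypoint support expects.
  static String toExecForm(String launchCommand) {
    return String.join(",", WHITESPACE.split(launchCommand.trim()));
  }

  public static void main(String[] args) {
    System.out.println(toExecForm("/usr/bin/sleep 6000")); // /usr/bin/sleep,6000
  }
}
{code}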
[jira] [Created] (YARN-8804) resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
Tao Yang created YARN-8804:
------------------------------

             Summary: resourceLimits may be wrongly calculated when leaf-queue is blocked in cluster with 3+ level queues
                 Key: YARN-8804
                 URL: https://issues.apache.org/jira/browse/YARN-8804
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 3.2.0
            Reporter: Tao Yang
            Assignee: Tao Yang

This problem stems from YARN-4280: a parent queue deducts a child queue's headroom when the child has reached its resource limit and the skipped type is QUEUE_LIMIT. The resource limits of the deepest parent queue are calculated correctly, but a non-deepest parent queue's headroom may be much more than the sum of its reached-limit child queues' headroom, so the resource limit of the non-deepest parent may be much less than its true value and block allocation for later queues.

To reproduce this problem with a UT:

(1) The cluster has two nodes, each with resource <10GB, 10core>, and 3-level queues as below. The max-capacity of "c1" is 10 and all others are 100, so the max-capacity of queue "c1" is <2GB, 2core>.

            Root
          /  |  \
         a   b   c
        10  20  70
                 |  \
                c1    c2
        10(max=10)    90

(2) Submit app1 to queue "c1" and launch am1 (resource=<1GB, 1core>) on nm1.
(3) Submit app2 to queue "b" and launch am2 (resource=<1GB, 1core>) on nm1.
(4) app1 and app2 each ask for one <2GB, 1core> container.
(5) nm1 sends one heartbeat.

Queue "c" now has a lower capacity percentage than queue "b", so the allocation sequence is "a" -> "c" -> "b". Queue "c1" has reached its queue limit, so the requests of app1 stay pending. The headroom of queue "c1" is <1GB, 1core> (= max-capacity - used), while the headroom of queue "c" is <18GB, 18core> (= max-capacity - used). After allocation for queue "c", the resource limit of queue "b" is wrongly calculated as <2GB, 2core>, and the headroom of queue "b" becomes <1GB, 1core> (= resource-limit - used), so the scheduler won't allocate a container for app2 on nm1.
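The arithmetic in the repro, worked through as plain numbers (not scheduler code). On the description's reading, the deduction should be bounded by the reached-limit leaf's headroom (c1, <1GB>), but it is instead taken from the non-deepest parent's headroom (c, <18GB>):

{code:java}
public class HeadroomSketch {
  public static void main(String[] args) {
    int clusterMB = 20 * 1024;          // two <10GB> nodes
    int cHeadroomMB = 18 * 1024;        // non-deepest parent "c"
    int c1HeadroomMB = 1024;            // reached-limit leaf "c1"

    // Deducting c's headroom leaves 2048 MB for queue "b", matching the
    // wrongly calculated <2GB> limit above; deducting only c1's headroom
    // would leave 19456 MB, enough for app2's <2GB> container.
    int wrongLimitForB = clusterMB - cHeadroomMB;
    int expectedLimitForB = clusterMB - c1HeadroomMB;
    System.out.println(wrongLimitForB + " MB vs " + expectedLimitForB + " MB");
  }
}
{code}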