[jira] [Created] (YARN-8940) Add volume as a top-level attribute in service spec
Weiwei Yang created YARN-8940:
---------------------------------
             Summary: Add volume as a top-level attribute in service spec
                 Key: YARN-8940
                 URL: https://issues.apache.org/jira/browse/YARN-8940
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Weiwei Yang

Initial thought:

{noformat}
{
  "name": "volume example",
  "version": "1.0.0",
  "description": "a simple volume example",
  "components": [
    {
      "name": "",
      "number_of_containers": 1,
      "artifact": {
        "id": "docker.io/centos:latest",
        "type": "DOCKER"
      },
      "launch_command": "sleep,120",
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      },
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "volumes": [
        {
          "volume": {
            "type": "s3_csi",
            "id": "5504d4a8-b246-11e8-94c2-026b17aa1190",
            "capability": {
              "min": "5Gi",
              "max": "100Gi"
            },
            "source_path": "s3://my_bucket/my",  # optional for object stores
            "mount_path": "/mnt/data",           # required, the mount point in the docker container
            "access_mode": "SINGLE_READ"         # how the volume can be accessed
          }
        }
      ]
    }
  ]
}
{noformat}

Open for discussion.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
[jira] [Resolved] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized
[ https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Yufei resolved YARN-8513.
------------------------------
       Resolution: Fixed
    Fix Version/s: 3.2.1
                   3.1.2

> CapacityScheduler infinite loop when queue is near fully utilized
> -----------------------------------------------------------------
>
>                 Key: YARN-8513
>                 URL: https://issues.apache.org/jira/browse/YARN-8513
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, yarn
>    Affects Versions: 3.1.0, 2.9.1
>         Environment: Ubuntu 14.04.5 and 16.04.4
>                      YARN is configured with one label and 5 queues.
>            Reporter: Chen Yufei
>            Priority: Major
>             Fix For: 3.1.2, 3.2.1
>
>         Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log,
>                      jstack-5.log, top-during-lock.log, top-when-normal.log,
>                      yarn3-jstack1.log, yarn3-jstack2.log, yarn3-jstack3.log,
>                      yarn3-jstack4.log, yarn3-jstack5.log,
>                      yarn3-resourcemanager.log, yarn3-top
>
> The ResourceManager sometimes stops responding to any request when a queue
> is near fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL
> does. After a restart, the RM can recover running jobs and start accepting
> new ones.
>
> Seems like CapacityScheduler is in an infinite loop, printing out the
> following log messages (more than 25,000 lines per second):
>
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: assignedContainer queue=root usedCapacity=0.99816763 absoluteUsedCapacity=0.99816763 used= cluster=}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator: assignedContainer application attempt=appattempt_1530619767030_1652_01 container=null queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943 clusterResource= type=NODE_LOCAL requestedPartition=}}
>
> I have encountered this problem several times since upgrading to YARN
> 2.9.1, while the same configuration worked fine under version 2.7.3.
>
> YARN-4477 is an infinite loop bug in FairScheduler; not sure if this is a
> similar problem.
[jira] [Created] (YARN-8939) Javadoc build fails in hadoop-yarn-csi
Takanobu Asanuma created YARN-8939:
-----------------------------------
             Summary: Javadoc build fails in hadoop-yarn-csi
                 Key: YARN-8939
                 URL: https://issues.apache.org/jira/browse/YARN-8939
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Takanobu Asanuma
            Assignee: Takanobu Asanuma

{noformat}
$ mvn javadoc:javadoc --projects hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi
...
[ERROR] /hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi/src/main/java/org/apache/hadoop/yarn/csi/client/CsiGrpcClient.java:92: error: exception not thrown: java.lang.InterruptedException
[ERROR]  * @throws InterruptedException
{noformat}
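For context, javadoc rejects an {{@throws}} tag that names a checked exception the method does not actually declare. A minimal sketch of the two usual fixes follows; the class and method names are illustrative only, not the real CsiGrpcClient code:

```java
// Illustrative only: mirrors the shape of the javadoc error, not the real class.
public class DocFixSketch {

    /**
     * Variant 1: keep the @throws tag and actually declare the exception.
     *
     * @throws InterruptedException if the wait is interrupted
     */
    public void closeDeclaring() throws InterruptedException {
        Thread.sleep(1); // may genuinely throw InterruptedException
    }

    /**
     * Variant 2: the method no longer throws, so the stale tag is removed.
     */
    public void closeQuietly() {
        // no @throws tag here; javadoc is satisfied
    }

    public static void main(String[] args) throws InterruptedException {
        DocFixSketch s = new DocFixSketch();
        s.closeDeclaring();
        s.closeQuietly();
        System.out.println("javadoc-clean");
    }
}
```

Either variant makes the tag consistent with the method signature, which is all the javadoc error is asking for.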
[jira] [Resolved] (YARN-6586) YARN to facilitate HTTPS in AM web server
[ https://issues.apache.org/jira/browse/YARN-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Kanter resolved YARN-6586.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 3.3.0

All subtasks are now complete. Thanks for the reviews, especially [~haibochen], and for the help with the dependency issues, especially [~eyang].

> YARN to facilitate HTTPS in AM web server
> -----------------------------------------
>
>                 Key: YARN-6586
>                 URL: https://issues.apache.org/jira/browse/YARN-6586
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: yarn
>    Affects Versions: 3.0.0-alpha2
>            Reporter: Haibo Chen
>            Assignee: Robert Kanter
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: Design Document v1.pdf, Design Document v2.pdf,
>                      YARN-6586.poc.patch
>
> The MR AM today does not support HTTPS in its web server, so the traffic
> between RMWebProxy and the MR AM is in clear text.
>
> MR cannot easily achieve this, mainly because MR AMs are untrusted by YARN.
> A potential solution purely within MR, similar to what Spark has
> implemented, is to let users provide their own keystore file when they
> enable HTTPS for an MR job; the file is then uploaded to the distributed
> cache and localized for the MR AM container. The configuration users need
> to do is complex. More importantly, in typical deployments web browsers go
> through RMWebProxy to indirectly access the MR AM web server, so to support
> AM HTTPS, RMWebProxy would need to trust the user-provided keystore, which
> is problematic.
>
> Alternatively, we can add an endpoint in the NM web server that acts as a
> proxy between the AM web server and RMWebProxy. RMWebProxy, when configured
> to do so, will send requests over HTTPS to the NM on which the AM is
> running, and the NM can then communicate with the local AM web server over
> HTTP. This adds one hop between RMWebProxy and the AM, but both MR and
> Spark can use such a solution.
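The extra hop described above can be pictured as a URL rewrite inside RMWebProxy: instead of addressing the AM directly, it addresses an HTTPS endpoint on the NM hosting the AM. A minimal sketch under assumed names; the "/am-proxy" path prefix and the method signature are hypothetical, not the actual YARN API:

```java
import java.net.URI;

// Hypothetical sketch of the proposed routing: RMWebProxy calls an HTTPS
// endpoint on the NM hosting the AM, and the NM forwards the request to the
// local AM web server over plain HTTP. The "/am-proxy" prefix is an
// assumption for illustration only.
public class NmAmProxyUrlSketch {

    // Build the HTTPS URL that RMWebProxy would call on the NM.
    public static URI toNmProxyUri(String nmHost, int nmHttpsPort,
                                   String containerId, String amPath) {
        return URI.create(String.format("https://%s:%d/am-proxy/%s%s",
                nmHost, nmHttpsPort, containerId, amPath));
    }

    public static void main(String[] args) {
        URI u = toNmProxyUri("nm1.example.com", 8044,
                "container_1530619767030_0001_01_000001", "/ws/v1/mapreduce/info");
        // The RMWebProxy-to-NM leg is HTTPS; the NM-to-AM leg stays local HTTP.
        System.out.println(u);
    }
}
```

The design choice being illustrated: because the NM is trusted by YARN, only NM certificates need to be trusted by RMWebProxy, so no user-provided keystore ever enters the trust chain.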
[jira] [Created] (YARN-8938) Add service upgrade cancel and express examples to the service upgrade doc
Chandni Singh created YARN-8938:
-----------------------------------
             Summary: Add service upgrade cancel and express examples to the service upgrade doc
                 Key: YARN-8938
                 URL: https://issues.apache.org/jira/browse/YARN-8938
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Chandni Singh
            Assignee: Chandni Singh
[jira] [Created] (YARN-8937) TestLeaderElectorService hangs
Jason Lowe created YARN-8937:
--------------------------------
             Summary: TestLeaderElectorService hangs
                 Key: YARN-8937
                 URL: https://issues.apache.org/jira/browse/YARN-8937
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 3.3.0
            Reporter: Jason Lowe

TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to start and eventually gets killed by the surefire timeout.
Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/

[Oct 22, 2018 5:49:16 AM] (aengineer) HDDS-705. OS3Exception resource name should be the actual resource name.
[Oct 22, 2018 5:50:28 AM] (aengineer) HDDS-707. Allow registering MBeans without additional jmx properties.
[Oct 22, 2018 6:18:38 AM] (aengineer) HDDS-544. Unconditional wait findbug warning from ReplicationSupervisor.
[Oct 22, 2018 8:45:51 AM] (sunilg) YARN-7502. Nodemanager restart docs should describe nodemanager
[Oct 22, 2018 10:07:40 AM] (nanda) Revert "HDDS-705. OS3Exception resource name should be the actual
[Oct 22, 2018 10:18:36 AM] (nanda) HDDS-705. OS3Exception resource name should be the actual resource name.
[Oct 22, 2018 10:21:12 AM] (stevel) HADOOP-15866. Renamed HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT keys
[Oct 22, 2018 2:29:35 PM] (msingh) HDDS-638. Enable ratis snapshots for HDDS datanodes. Contributed by
[Oct 22, 2018 5:17:12 PM] (bharat) HDDS-705. addendum patch to fix find bug issue. Contributed by Bharat
[Oct 22, 2018 7:28:58 PM] (eyang) YARN-8922. Fixed test-container-executor test setup and clean up.
[Oct 22, 2018 7:59:52 PM] (eyang) YARN-8542. Added YARN service REST API to list containers.
[Oct 22, 2018 9:44:28 PM] (arp) HDFS-13941. make storageId in BlockPoolTokenSecretManager.checkAccess
[Oct 22, 2018 10:39:57 PM] (eyang) YARN-8923. Cleanup references to ENV file type in YARN service code.
[Oct 22, 2018 10:57:01 PM] (aengineer) HDDS-676. Enable Read from open Containers via Standalone Protocol.
-1 overall

The following subsystems voted -1:
    asflicense findbugs hadolint pathlen unit xml

The following subsystems voted -1 but were configured to be filtered/ignored:
    cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace

The following subsystems are considered long running (runtime bigger than 1h 0m 0s):
    unit

Specific tests:

    XML :

        Parsing Error(s):
            hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml

    FindBugs :

        module:hadoop-common-project/hadoop-registry
            Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.addNIOTCP(InetAddress, int) At RegistryDNS.java:[line 900]
            Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.addNIOUDP(InetAddress, int) At RegistryDNS.java:[line 926]
            Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.serveNIOTCP(ServerSocketChannel, InetAddress, int) At RegistryDNS.java:[line 850]

    Failed CTEST tests :

        test_test_libhdfs_threaded_hdfs_static
        test_libhdfs_threaded_hdfspp_test_shim_static

    Failed junit tests :

        hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized
        hadoop.hdfs.TestLeaseRecovery2
        hadoop.hdfs.web.TestWebHdfsTimeouts
        hadoop.yarn.server.resourcemanager.TestRMAdminService
        hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy
        hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation
        hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens
        hadoop.yarn.client.api.impl.TestNMClient
        hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage
        hadoop.mapreduce.v2.hs.server.TestHSAdminServer
        hadoop.streaming.TestStreamingBadRecords
        hadoop.streaming.TestMultipleCachefiles
        hadoop.streaming.TestMultipleArchiveFiles
        hadoop.streaming.TestSymLink
        hadoop.streaming.TestFileArgs
        hadoop.mapred.gridmix.TestDistCacheEmulation
        hadoop.mapred.gridmix.TestLoadJob
        hadoop.mapred.gridmix.TestSleepJob
        hadoop.mapred.gridmix.TestGridmixSubmission
        hadoop.tools.TestDistCh

    cc:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-compile-cc-root.txt [4.0K]
    javac:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-compile-javac-root.txt [296K]
    checkstyle:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-checkstyle-root.txt [17M]
    hadolint:
        https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-patch-hadolint.txt [4.0K]
[jira] [Created] (YARN-8935) stopped yarn service app does not show when "yarn app -list"
kyungwan nam created YARN-8935:
----------------------------------
             Summary: stopped yarn service app does not show when "yarn app -list"
                 Key: YARN-8935
                 URL: https://issues.apache.org/jira/browse/YARN-8935
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-native-services
         Environment: A stopped yarn service app can be re-started or destroyed even though it does not appear in the RM.
                      "yarn app -list" should also show stopped yarn service apps.

{code}
$ yarn app -list
18/10/23 15:24:19 INFO client.RMProxy: Connecting to ResourceManager at a.com/10.1.1.100:8050
18/10/23 15:24:19 INFO client.AHSProxy: Connecting to Application History server at a.com/10.1.1.100:10200
Total number of applications (application-types: [], states: [SUBMITTED, ACCEPTED, RUNNING] and tags: []):0
                Application-Id      Application-Name      Application-Type      User      Queue      State      Final-State      Progress      Tracking-URL
$
$ yarn app -destroy ats-hbase
18/10/23 15:24:50 INFO client.RMProxy: Connecting to ResourceManager at a.com/10.1.1.100:8050
18/10/23 15:24:51 INFO client.AHSProxy: Connecting to Application History server at a.com/10.1.1.100:10200
18/10/23 15:24:51 INFO client.RMProxy: Connecting to ResourceManager at a.com/10.1.1.100:8050
18/10/23 15:24:51 INFO client.AHSProxy: Connecting to Application History server at a.com/10.1.1.100:10200
18/10/23 15:24:51 INFO util.log: Logging initialized @1617ms
18/10/23 15:24:52 INFO client.ApiServiceClient: Successfully destroyed service ats-hbase
{code}

            Reporter: kyungwan nam
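The log above shows why the stopped service is invisible: the default filter for "yarn app -list" only includes the states [SUBMITTED, ACCEPTED, RUNNING]. A minimal sketch of that filtering behavior; the enum and helper below are an illustrative model, not the actual YARN client classes:

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

// Illustrative model of the -list state filter; not the real YARN client code.
public class AppListFilterSketch {

    public enum AppState { SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED }

    // Default filter matching the states shown in the log output above.
    public static final EnumSet<AppState> DEFAULT_FILTER =
            EnumSet.of(AppState.SUBMITTED, AppState.ACCEPTED, AppState.RUNNING);

    public static class App {
        final String name;
        final AppState state;
        public App(String name, AppState state) {
            this.name = name;
            this.state = state;
        }
    }

    // Return only the apps whose state is in the requested filter.
    public static List<App> list(List<App> apps, EnumSet<AppState> filter) {
        List<App> out = new ArrayList<>();
        for (App a : apps) {
            if (filter.contains(a.state)) {
                out.add(a);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<App> apps = new ArrayList<>();
        apps.add(new App("ats-hbase", AppState.KILLED)); // a stopped service
        // Under the default filter the stopped service does not show up...
        System.out.println(list(apps, DEFAULT_FILTER).size());                 // 0
        // ...but widening the filter to all states would include it.
        System.out.println(list(apps, EnumSet.allOf(AppState.class)).size());  // 1
    }
}
```

This illustrates the reported inconsistency: a service outside the default filter is invisible to -list yet can still be destroyed by name.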