[jira] [Created] (YARN-8940) Add volume as a top-level attribute in service spec

2018-10-23 Thread Weiwei Yang (JIRA)
Weiwei Yang created YARN-8940:
-

 Summary: Add volume as a top-level attribute in service spec 
 Key: YARN-8940
 URL: https://issues.apache.org/jira/browse/YARN-8940
 Project: Hadoop YARN
  Issue Type: Sub-task
Reporter: Weiwei Yang


Initial thought:
{noformat}
{
  "name": "volume example",
  "version": "1.0.0",
  "description": "a simple volume example",
  "components": [
    {
      "name": "",
      "number_of_containers": 1,
      "artifact": {
        "id": "docker.io/centos:latest",
        "type": "DOCKER"
      },
      "launch_command": "sleep,120",
      "configuration": {
        "env": {
          "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
        }
      },
      "resource": {
        "cpus": 1,
        "memory": "256"
      },
      "volumes": [
        {
          "volume": {
            "type": "s3_csi",
            "id": "5504d4a8-b246-11e8-94c2-026b17aa1190",
            "capability": {
              "min": "5Gi",
              "max": "100Gi"
            },
            "source_path": "s3://my_bucket/my",  # optional for object stores
            "mount_path": "/mnt/data",           # required, the mount point in the docker container
            "access_mode": "SINGLE_READ"         # how the volume can be accessed
          }
        }
      ]
    }
  ]
}
{noformat}
Open for discussion.
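
For context, a spec like this would presumably be submitted the same way as any 
other YARN service spec. Assuming the usual services CLI, something like the 
following would launch it (the service name and file path are illustrative):
{noformat}
yarn app -launch volume-example volume-example.json
{noformat}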



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Resolved] (YARN-8513) CapacityScheduler infinite loop when queue is near fully utilized

2018-10-23 Thread Chen Yufei (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-8513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Yufei resolved YARN-8513.
--
   Resolution: Fixed
Fix Version/s: 3.2.1
   3.1.2

> CapacityScheduler infinite loop when queue is near fully utilized
> -
>
> Key: YARN-8513
> URL: https://issues.apache.org/jira/browse/YARN-8513
> Project: Hadoop YARN
>  Issue Type: Bug
>  Components: capacity scheduler, yarn
>Affects Versions: 3.1.0, 2.9.1
> Environment: Ubuntu 14.04.5 and 16.04.4
> YARN is configured with one label and 5 queues.
>Reporter: Chen Yufei
>Priority: Major
> Fix For: 3.1.2, 3.2.1
>
> Attachments: jstack-1.log, jstack-2.log, jstack-3.log, jstack-4.log, 
> jstack-5.log, top-during-lock.log, top-when-normal.log, yarn3-jstack1.log, 
> yarn3-jstack2.log, yarn3-jstack3.log, yarn3-jstack4.log, yarn3-jstack5.log, 
> yarn3-resourcemanager.log, yarn3-top
>
>
> ResourceManager sometimes stops responding to any request when a queue is nearly 
> fully utilized. Sending SIGTERM won't stop the RM; only SIGKILL can. After the RM 
> restarts, it can recover running jobs and start accepting new ones.
>  
> CapacityScheduler appears to be stuck in an infinite loop, printing the following 
> log messages (more than 25,000 lines in a second):
>  
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: 
> assignedContainer queue=root usedCapacity=0.99816763 
> absoluteUsedCapacity=0.99816763 used= 
> cluster=}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
>  Failed to accept allocation proposal}}
> {{2018-07-10 17:16:29,227 INFO 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.AbstractContainerAllocator:
>  assignedContainer application attempt=appattempt_1530619767030_1652_01 
> container=null 
> queue=org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator@14420943
>  clusterResource= type=NODE_LOCAL 
> requestedPartition=}}
>  
> I have encountered this problem several times since upgrading to YARN 2.9.1; the 
> same configuration worked fine under version 2.7.3.
>  
> YARN-4477 is an infinite-loop bug in FairScheduler; I am not sure whether this is 
> a similar problem.
>  
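
A quick way to confirm the loop rate described above, assuming the default RM log 
layout with a "yyyy-MM-dd HH:mm:ss,SSS" timestamp (the log file name is 
illustrative):
{noformat}
grep 'Failed to accept allocation proposal' yarn-rm.log | cut -d, -f1 | uniq -c | sort -rn | head
{noformat}
This buckets the matching lines per second; counts in the tens of thousands match 
the behavior described.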



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-8939) Javadoc build fails in hadoop-yarn-csi

2018-10-23 Thread Takanobu Asanuma (JIRA)
Takanobu Asanuma created YARN-8939:
--

 Summary: Javadoc build fails in hadoop-yarn-csi
 Key: YARN-8939
 URL: https://issues.apache.org/jira/browse/YARN-8939
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Takanobu Asanuma
Assignee: Takanobu Asanuma


{noformat}
$ mvn javadoc:javadoc --projects hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi
...
[ERROR] 
/hadoop/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-csi/src/main/java/org/apache/hadoop/yarn/csi/client/CsiGrpcClient.java:92:
 error: exception not thrown: java.lang.InterruptedException
[ERROR]* @throws InterruptedException
{noformat}
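
The error is javadoc rejecting a @throws tag for a checked exception the method 
does not declare. A minimal sketch of the pattern and the two obvious fixes (the 
class and methods below are illustrative, not the actual CsiGrpcClient code):
{noformat}
public class ThrowsTagExample {
  /**
   * Closes the client.
   *
   * @throws InterruptedException never thrown here, so javadoc fails with
   *         "error: exception not thrown: java.lang.InterruptedException"
   */
  public void close() {
    // no blocking call, and no "throws InterruptedException" in the signature
  }

  // Fix: either delete the stale @throws tag, or declare the exception if the
  // method really can propagate it:

  /**
   * Closes the client, waiting briefly for in-flight calls.
   *
   * @throws InterruptedException if the wait is interrupted
   */
  public void closeBlocking() throws InterruptedException {
    Thread.sleep(10); // placeholder for a blocking shutdown wait
  }
}
{noformat}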



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Resolved] (YARN-6586) YARN to facilitate HTTPS in AM web server

2018-10-23 Thread Robert Kanter (JIRA)


 [ 
https://issues.apache.org/jira/browse/YARN-6586?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter resolved YARN-6586.
-
   Resolution: Fixed
Fix Version/s: 3.3.0

All subtasks are now complete.

Thanks for the reviews, especially [~haibochen], and for the help with the 
dependency issues, especially [~eyang].

> YARN to facilitate HTTPS in AM web server
> -
>
> Key: YARN-6586
> URL: https://issues.apache.org/jira/browse/YARN-6586
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: yarn
>Affects Versions: 3.0.0-alpha2
>Reporter: Haibo Chen
>Assignee: Robert Kanter
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: Design Document v1.pdf, Design Document v2.pdf, 
> YARN-6586.poc.patch
>
>
> The MR AM today does not support HTTPS in its web server, so the traffic between 
> RMWebProxy and the MR AM is in clear text.
> MR cannot easily fix this on its own, mainly because YARN does not trust MR AMs. A 
> potential solution purely within MR, similar to what Spark has implemented, is to 
> let users provide their own keystore file when they enable HTTPS for an MR job; 
> the file is then uploaded to the distributed cache and localized for the MR AM 
> container. But the configuration users would need to do is complex.
> More importantly, in typical deployments web browsers go through RMWebProxy to 
> indirectly access the MR AM web server. To support HTTPS to the MR AM, RMWebProxy 
> would therefore need to trust the user-provided keystore, which is problematic.
> Alternatively, we can add an endpoint in the NM web server that acts as a proxy 
> between the AM web server and RMWebProxy. RMWebProxy, when configured to do so, 
> sends requests over HTTPS to the NM on which the AM is running, and the NM then 
> communicates with the local AM web server over plain HTTP. This adds one hop 
> between RMWebProxy and the AM, but both MR and Spark can use such a solution.
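
For illustration only, a sketch of the NM-side forwarding hop described above (this 
is not the actual YARN-6586 patch; the port, path prefix, and class name are made 
up, TLS termination on the NM side is omitted, and readAllBytes needs JDK 9+):
{noformat}
import com.sun.net.httpserver.HttpServer;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class NmAmProxySketch {
  static final String AM_HOST = "localhost"; // the AM runs on this NM's host
  static final int AM_PORT = 8888;           // hypothetical AM web server port

  public static void main(String[] args) throws Exception {
    // NM-side endpoint; in the real design RMWebProxy would talk HTTPS to this.
    HttpServer server = HttpServer.create(new InetSocketAddress(8047), 0);
    server.createContext("/amproxy", exchange -> {
      // Strip the proxy prefix and relay the request to the local AM over HTTP.
      String path = exchange.getRequestURI().getPath().replaceFirst("^/amproxy", "");
      URL amUrl = new URL("http", AM_HOST, AM_PORT, path.isEmpty() ? "/" : path);
      HttpURLConnection conn = (HttpURLConnection) amUrl.openConnection();
      int status = conn.getResponseCode();
      InputStream in = conn.getErrorStream();  // non-null only on error responses
      if (in == null) {
        in = conn.getInputStream();            // success path
      }
      byte[] body = in.readAllBytes();
      in.close();
      exchange.sendResponseHeaders(status, body.length == 0 ? -1 : body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
  }
}
{noformat}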



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-8938) Add service upgrade cancel and express examples to the service upgrade doc

2018-10-23 Thread Chandni Singh (JIRA)
Chandni Singh created YARN-8938:
---

 Summary: Add service upgrade cancel and express examples to the 
service upgrade doc
 Key: YARN-8938
 URL: https://issues.apache.org/jira/browse/YARN-8938
 Project: Hadoop YARN
  Issue Type: Bug
Reporter: Chandni Singh
Assignee: Chandni Singh






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



[jira] [Created] (YARN-8937) TestLeaderElectorService hangs

2018-10-23 Thread Jason Lowe (JIRA)
Jason Lowe created YARN-8937:


 Summary: TestLeaderElectorService hangs
 Key: YARN-8937
 URL: https://issues.apache.org/jira/browse/YARN-8937
 Project: Hadoop YARN
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Jason Lowe


TestLeaderElectorService hangs waiting for the TestingZooKeeperServer to start 
and eventually gets killed by the surefire timeout.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-10-23 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/

[Oct 22, 2018 5:49:16 AM] (aengineer) HDDS-705. OS3Exception resource name 
should be the actual resource name.
[Oct 22, 2018 5:50:28 AM] (aengineer) HDDS-707. Allow registering MBeans 
without additional jmx properties.
[Oct 22, 2018 6:18:38 AM] (aengineer) HDDS-544. Unconditional wait findbug 
warning from ReplicationSupervisor.
[Oct 22, 2018 8:45:51 AM] (sunilg) YARN-7502. Nodemanager restart docs should 
describe nodemanager
[Oct 22, 2018 10:07:40 AM] (nanda) Revert "HDDS-705. OS3Exception resource name 
should be the actual
[Oct 22, 2018 10:18:36 AM] (nanda) HDDS-705. OS3Exception resource name should 
be the actual resource name.
[Oct 22, 2018 10:21:12 AM] (stevel) HADOOP-15866. Renamed 
HADOOP_SECURITY_GROUP_SHELL_COMMAND_TIMEOUT keys
[Oct 22, 2018 2:29:35 PM] (msingh) HDDS-638. Enable ratis snapshots for HDDS 
datanodes. Contributed by
[Oct 22, 2018 5:17:12 PM] (bharat) HDDS-705. addendum patch to fix find bug 
issue. Contributed by Bharat
[Oct 22, 2018 7:28:58 PM] (eyang) YARN-8922. Fixed test-container-executor test 
setup and clean up.   
[Oct 22, 2018 7:59:52 PM] (eyang) YARN-8542. Added YARN service REST API to 
list containers.   
[Oct 22, 2018 9:44:28 PM] (arp) HDFS-13941. make storageId in 
BlockPoolTokenSecretManager.checkAccess
[Oct 22, 2018 10:39:57 PM] (eyang) YARN-8923.  Cleanup references to ENV file 
type in YARN service code.   
[Oct 22, 2018 10:57:01 PM] (aengineer) HDDS-676. Enable Read from open 
Containers via Standalone Protocol.




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml
 

FindBugs :

   module:hadoop-common-project/hadoop-registry 
   Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.addNIOTCP(InetAddress, int) At RegistryDNS.java:[line 900] 
   Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.addNIOUDP(InetAddress, int) At RegistryDNS.java:[line 926] 
   Exceptional return value of java.util.concurrent.ExecutorService.submit(Callable) ignored in org.apache.hadoop.registry.server.dns.RegistryDNS.serveNIOTCP(ServerSocketChannel, InetAddress, int) At RegistryDNS.java:[line 850] 
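
The three RegistryDNS findings above are the classic "ignored Future" pattern: if 
the submitted Callable throws, the exception is captured in the discarded Future 
and never observed. A minimal sketch of the problem and one common fix 
(illustrative, not the actual RegistryDNS code):
{noformat}
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class IgnoredFutureExample {
  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    Callable<Void> task = () -> { throw new IllegalStateException("lost failure"); };

    // The pattern FindBugs flags: the Future is discarded, so the exception
    // thrown inside the task is swallowed and never observed.
    pool.submit(task);

    // One common fix: keep the Future and inspect it; get() rethrows the
    // task's failure wrapped in an ExecutionException.
    Future<Void> result = pool.submit(task);
    try {
      result.get();
    } catch (ExecutionException e) {
      System.err.println("task failed: " + e.getCause());
    }
    pool.shutdown();
  }
}
{noformat}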

Failed CTEST tests :

   test_test_libhdfs_threaded_hdfs_static 
   test_libhdfs_threaded_hdfspp_test_shim_static 

Failed junit tests :

   hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized 
   hadoop.hdfs.TestLeaseRecovery2 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.server.resourcemanager.TestRMAdminService 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestQueueManagementDynamicEditPolicy 
   hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.resourcemanager.security.TestRMDelegationTokens 
   hadoop.yarn.client.api.impl.TestNMClient 
   hadoop.yarn.server.timelineservice.reader.TestTimelineReaderWebServicesHBaseStorage 
   hadoop.mapreduce.v2.hs.server.TestHSAdminServer 
   hadoop.streaming.TestStreamingBadRecords 
   hadoop.streaming.TestMultipleCachefiles 
   hadoop.streaming.TestMultipleArchiveFiles 
   hadoop.streaming.TestSymLink 
   hadoop.streaming.TestFileArgs 
   hadoop.mapred.gridmix.TestDistCacheEmulation 
   hadoop.mapred.gridmix.TestLoadJob 
   hadoop.mapred.gridmix.TestSleepJob 
   hadoop.mapred.gridmix.TestGridmixSubmission 
   hadoop.tools.TestDistCh 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-compile-javac-root.txt
  [296K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/935/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   

[jira] [Created] (YARN-8935) stopped yarn service app does not show in "yarn app -list"

2018-10-23 Thread kyungwan nam (JIRA)
kyungwan nam created YARN-8935:
--

 Summary: stopped yarn service app does not show in "yarn app 
-list"
 Key: YARN-8935
 URL: https://issues.apache.org/jira/browse/YARN-8935
 Project: Hadoop YARN
  Issue Type: Bug
  Components: yarn-native-services
 Environment: A stopped yarn service app can be re-started or destroyed even 
if it does not exist in the RM. "yarn app -list" should also show stopped 
yarn service apps.

{code}
$ yarn app -list
18/10/23 15:24:19 INFO client.RMProxy: Connecting to ResourceManager at 
a.com/10.1.1.100:8050
18/10/23 15:24:19 INFO client.AHSProxy: Connecting to Application History 
server at a.com/10.1.1.100:10200
Total number of applications (application-types: [], states: [SUBMITTED, 
ACCEPTED, RUNNING] and tags: []):0
Application-Id  Application-NameApplication-Type
  User   Queue   State Final-State  
   ProgressTracking-URL
$
$ yarn app -destroy ats-hbase
18/10/23 15:24:50 INFO client.RMProxy: Connecting to ResourceManager at 
a.com/10.1.1.100:8050
18/10/23 15:24:51 INFO client.AHSProxy: Connecting to Application History 
server at a.com/10.1.1.100:10200
18/10/23 15:24:51 INFO client.RMProxy: Connecting to ResourceManager at 
a.com/10.1.1.100:8050
18/10/23 15:24:51 INFO client.AHSProxy: Connecting to Application History 
server at a.com/10.1.1.100:10200
18/10/23 15:24:51 INFO util.log: Logging initialized @1617ms
18/10/23 15:24:52 INFO client.ApiServiceClient: Successfully destroyed service 
ats-hbase
{code}
Reporter: kyungwan nam
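
Note: the empty listing above matches the default state filter shown in the output, 
[SUBMITTED, ACCEPTED, RUNNING]. Assuming the standard application CLI flags, stopped 
services should still be listable with an explicit filter, e.g.:
{code}
yarn app -list -appStates FINISHED,KILLED
{code}
So the issue is about the default view, not the service record being gone.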






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org