[jira] [Moved] (YARN-8430) Some zip files passed with spark-submit --archives causing "invalid CEN header" error
[ https://issues.apache.org/jira/browse/YARN-8430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin moved SPARK-24559 to YARN-8430: -- Affects Version/s: (was: 2.2.0) Component/s: (was: Spark Submit) Workflow: no-reopen-closed, patch-avail (was: no-reopen-closed) Key: YARN-8430 (was: SPARK-24559) Project: Hadoop YARN (was: Spark)

> Some zip files passed with spark-submit --archives causing "invalid CEN header" error
> Key: YARN-8430
> URL: https://issues.apache.org/jira/browse/YARN-8430
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: James Porritt
> Priority: Major
>
> I'm encountering an error when passing some zip files to spark-submit via --archives: archives that are over 2 GB and have the zip64 flag set fail.
> {noformat}
> PYSPARK_PYTHON=./ROOT/myspark/bin/python /usr/hdp/current/spark2-client/bin/spark-submit \
>   --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=./ROOT/myspark/bin/python \
>   --master=yarn \
>   --deploy-mode=cluster \
>   --driver-memory=4g \
>   --archives=myspark.zip#ROOT \
>   --num-executors=32 \
>   --packages com.databricks:spark-avro_2.11:4.0.0 \
>   foo.py
> {noformat}
> (As background, I'm preparing files using the trick of zipping a conda environment and passing the zip file via --archives, as per: https://community.hortonworks.com/articles/58418/running-pyspark-with-conda-env.html)
> myspark.zip is a zipped conda environment, created with Python's zipfile package. The files are stored without deflation and with the zip64 flag set. foo.py is my application code. This normally works, but if myspark.zip is greater than 2 GB and has the zip64 flag set I get:
> {noformat}
> java.util.zip.ZipException: invalid CEN header (bad signature)
> {noformat}
> There is much written on the subject, and I was able to write Java code using the java.util.zip library that both does and does not encounter this error for one of the problematic zip files.
> Spark compile info:
> {noformat}
> Welcome to
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 2.2.0.2.6.4.0-91
>       /_/
>
> Using Scala version 2.11.8, Java HotSpot(TM) 64-Bit Server VM, 1.8.0_112
> Branch HEAD
> Compiled by user jenkins on 2018-01-04T10:41:05Z
> Revision a24017869f5450397136ee8b11be818e7cd3facb
> Url g...@github.com:hortonworks/spark2.git
> Type --help for more information.
> {noformat}
> YARN logs on the console after the above command. I've tried both --deploy-mode=cluster and --deploy-mode=client.
> {noformat}
> 18/06/13 16:00:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 18/06/13 16:00:23 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
> 18/06/13 16:00:23 INFO RMProxy: Connecting to ResourceManager at myhost2.myfirm.com/10.87.11.17:8050
> 18/06/13 16:00:23 INFO Client: Requesting a new application from cluster with 6 NodeManagers
> 18/06/13 16:00:23 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (221184 MB per container)
> 18/06/13 16:00:23 INFO Client: Will allocate AM container, with 18022 MB memory including 1638 MB overhead
> 18/06/13 16:00:23 INFO Client: Setting up container launch context for our AM
> 18/06/13 16:00:23 INFO Client: Setting up the launch environment for our AM container
> 18/06/13 16:00:23 INFO Client: Preparing resources for our AM container
> 18/06/13 16:00:24 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://myhost.myfirm.com:8020/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz
> 18/06/13 16:00:24 INFO Client: Source and destination file systems are the same. Not copying hdfs://myhost.myfirm.com:8020/hdp/apps/2.6.4.0-91/spark2/spark2-hdp-yarn-archive.tar.gz
> 18/06/13 16:00:24 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/com.databricks_spark-avro_2.11-4.0.0.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/com.databricks_spark-avro_2.11-4.0.0.jar
> 18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.slf4j_slf4j-api-1.7.5.jar -> hdfs://myhost.myfirm.com:8020/user/myuser/.sparkStaging/application_1528901858967_0019/org.slf4j_slf4j-api-1.7.5.jar
> 18/06/13 16:00:26 INFO Client: Uploading resource file:/home/myuser/.ivy2/jars/org.apache.avro_avro-1.7.6.jar ->
> {noformat}
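As a point of reference, the archive layout the reporter describes (stored members, zip64 records) can be reproduced with Python's zipfile module. This is a minimal sketch, not the reporter's actual script: the `make_archive` helper and its member names are hypothetical, and a real conda-environment archive would of course be built from a directory tree. The JVM-side "invalid CEN header" failure only manifests once such an archive crosses the 2 GB threshold.

```python
import zipfile

def make_archive(zip_path, members):
    """Write a stored (uncompressed) zip with zip64 records for every member,
    mimicking the layout described in this report. `members` maps archive
    names to bytes."""
    with zipfile.ZipFile(zip_path, "w",
                         compression=zipfile.ZIP_STORED,  # no deflation
                         allowZip64=True) as zf:
        for name, data in members.items():
            # force_zip64 emits zip64 extensions even for small members,
            # as happens when member sizes are not known up front
            with zf.open(name, "w", force_zip64=True) as f:
                f.write(data)

def readable_from_python(zip_path):
    """Sanity check from the Python side; the reported failure is specific
    to java.util.zip, which chokes on the zip64 central directory."""
    with zipfile.ZipFile(zip_path) as zf:
        return zf.testzip() is None
```

Python reads such archives back without complaint, which matches the report: the same file both does and does not trigger the error depending on which zip implementation opens it.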
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345976#comment-14345976 ] Marcelo Vanzin commented on YARN-2423: --

We'd rather not depend on unstable APIs. But in this context, what does Unstable mean? When ATS v2 is released, will all support for ATS v1 be removed? Are you going to change all the APIs to work against v2, making code built against v1 effectively broken? I'd imagine that if v2 is really incompatible you'd add a new set of APIs and then deprecate v1 instead. The v1 APIs would be public, stable and deprecated at that point.

> TimelineClient should wrap all GET APIs to facilitate Java users
> Key: YARN-2423
> URL: https://issues.apache.org/jira/browse/YARN-2423
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Zhijie Shen
> Assignee: Robert Kanter
> Attachments: YARN-2423.004.patch, YARN-2423.005.patch, YARN-2423.006.patch, YARN-2423.007.patch, YARN-2423.patch, YARN-2423.patch, YARN-2423.patch
>
> TimelineClient provides the Java method to put timeline entities. It would also be good to wrap all GET APIs (both entity and domain) and deserialize the JSON response into Java POJO objects.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337682#comment-14337682 ] Marcelo Vanzin commented on YARN-2423: --

Hi Vinod, I think the point Robert was trying to make is that adding these APIs might force Yarn to maintain compatibility for them, which would allow clients to code against the public API with a reasonable expectation that it wouldn't break. But I understand that with the redesign it might be hard to maintain compatibility. I guess it's a choice you guys have to make, but the lack of a public, stable read API is definitely a barrier for Spark adopting this feature. (I understand we could write code to talk to the REST server directly, but you seem to imply that this approach would also run into compatibility issues after the redesign.)

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14337630#comment-14337630 ] Marcelo Vanzin commented on YARN-2423: --

Hi [~vinodkv], what are these APIs? Are they declared as InterfaceAudience(Public)? If they're not, we're not willing to use them in Spark, because there's no commitment from Yarn to keep them stable. The only TimelineClient I see on branch-2.6 has no methods to read data from the ATS.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (YARN-2423) TimelineClient should wrap all GET APIs to facilitate Java users
[ https://issues.apache.org/jira/browse/YARN-2423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329436#comment-14329436 ] Marcelo Vanzin commented on YARN-2423: --

Hey everybody, just wanted to point out that this bug is currently marked as a blocker for integration between Spark and the ATS. It would be really great to avoid having to write our own REST client just to talk to the ATS, and if the same API can be used to support YARN-2928, even better.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (YARN-2642) MiniYARNCluster's start() may return before configuration is updated
Marcelo Vanzin created YARN-2642:

> Summary: MiniYARNCluster's start() may return before configuration is updated
> Key: YARN-2642
> URL: https://issues.apache.org/jira/browse/YARN-2642
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
> Reporter: Marcelo Vanzin
>
> When starting a new MiniYARNCluster, it's possible that the start() method returns before the configuration has been updated by ClientRMService. If that happens, YarnConfiguration.RM_ADDRESS will contain the wrong RM address (generally with port 0), and that can cause tests to fail. More details: https://github.com/apache/spark/commit/fbe8e9856b23262193105e7bf86075f516f0db25

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
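The workaround in the linked Spark commit amounts to polling after start() until the published RM address carries a real port. A generic sketch of that wait-until pattern (Python here for brevity; `wait_for` and `has_real_port` are illustrative names, not YARN or Spark APIs — the real test would re-read YarnConfiguration.RM_ADDRESS in the condition):

```python
import time

def wait_for(condition, timeout_s=10.0, interval_s=0.1):
    """Poll `condition` until it returns a truthy value, or raise after
    `timeout_s` seconds."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        value = condition()
        if value:
            return value
        time.sleep(interval_s)
    raise TimeoutError("condition not met within %.1fs" % timeout_s)

def has_real_port(address):
    """True once a 'host:port' address carries a nonzero port; port 0 means
    the RM has not yet published the address it actually bound to."""
    host, _, port = address.rpartition(":")
    return bool(host) and port.isdigit() and int(port) != 0
```

A proper fix inside MiniYARNCluster itself would do this wait before start() returns, so callers never observe the intermediate port-0 configuration.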
[jira] [Updated] (YARN-2444) Primary filters added after first submission not indexed, cause exceptions in logs.
[ https://issues.apache.org/jira/browse/YARN-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated YARN-2444: -- Attachment: ats.java

> Primary filters added after first submission not indexed, cause exceptions in logs.
> Key: YARN-2444
> URL: https://issues.apache.org/jira/browse/YARN-2444
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Affects Versions: 2.5.0
> Reporter: Marcelo Vanzin
> Attachments: ats.java
>
> See the attached code for an example. The code creates an entity with a primary filter and submits it to the ATS. After that, a new primary filter value is added and the entity is resubmitted. At that point two things can be seen:
> - Searching for the new primary filter value does not return the entity
> - The following exception shows up in the logs:
> {noformat}
> 14/08/22 11:33:42 ERROR webapp.TimelineWebServices: Error when verifying access for user dr.who (auth:SIMPLE) on the events of the timeline entity { id: testid-48625678-9cbb-4e71-87de-93c50be51d1a, type: test }
> org.apache.hadoop.yarn.exceptions.YarnException: Owner information of the timeline entity { id: testid-48625678-9cbb-4e71-87de-93c50be51d1a, type: test } is corrupted.
>     at org.apache.hadoop.yarn.server.timeline.security.TimelineACLsManager.checkAccess(TimelineACLsManager.java:67)
>     at org.apache.hadoop.yarn.server.timeline.webapp.TimelineWebServices.getEntities(TimelineWebServices.java:172)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at com.sun.jersey.spi.container.JavaMethodInvokerFactory$1.invoke(JavaMethodInvokerFactory.java:60)
> {noformat}

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-2444) Primary filters added after first submission not indexed, cause exceptions in logs.
[ https://issues.apache.org/jira/browse/YARN-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107269#comment-14107269 ] Marcelo Vanzin commented on YARN-2444: --

The following search triggers the problem described above:
{noformat}
/ws/v1/timeline/test?primaryFilter=prop2:val2
{noformat}
The following one works as expected:
{noformat}
/ws/v1/timeline/test?primaryFilter=prop1:val1
{noformat}

-- This message was sent by Atlassian JIRA (v6.2#6252)
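The attached ats.java is not reproduced in this thread, but the repro can be outlined against the ATS v1 REST API using the entity field names visible in these reports. This is a sketch under assumptions: the host/port (8188 is the usual timeline webapp default) and the helper names are illustrative, and only the payload construction is shown as runnable code.

```python
import json
import urllib.request

# Assumed ATS endpoint; adjust host/port for your cluster.
ATS = "http://localhost:8188/ws/v1/timeline"

def entity(entity_id, primary_filters):
    """Build an ATS v1 timeline entity of type 'test' with the given primary
    filters ({name: [values]}); field names match the JSON in these reports."""
    return {
        "entity": entity_id,
        "entitytype": "test",
        "starttime": 1,
        "primaryfilters": primary_filters,
    }

def put_entities(*entities):
    """POST a batch of entities to the timeline server (needs a running ATS)."""
    req = urllib.request.Request(
        ATS,
        data=json.dumps({"entities": list(entities)}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

# Repro outline (against a live ATS):
#   put_entities(entity("testid-1", {"prop1": ["val1"]}))
#   put_entities(entity("testid-1", {"prop1": ["val1"], "prop2": ["val2"]}))
#   GET .../test?primaryFilter=prop1:val1  -> returns the entity
#   GET .../test?primaryFilter=prop2:val2  -> entity missing, ACLs error logged
```

The two GET paths at the end correspond to the working and failing queries quoted in the comment above.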
[jira] [Created] (YARN-2445) ATS does not reflect changes to uploaded TimelineEntity
Marcelo Vanzin created YARN-2445:

> Summary: ATS does not reflect changes to uploaded TimelineEntity
> Key: YARN-2445
> URL: https://issues.apache.org/jira/browse/YARN-2445
> Project: Hadoop YARN
> Issue Type: Bug
> Components: timelineserver
> Reporter: Marcelo Vanzin
> Priority: Minor
> Attachments: ats2.java
>
> If you make a change to a TimelineEntity and send it to the ATS, that change is not reflected in the stored data. For example, in the attached code, an existing primary filter is removed and a new one is added. When you retrieve the entity from the ATS, it only contains the old value:
> {noformat}
> {"entities":[{"events":[],"entitytype":"test","entity":"testid-ad5380c0-090e-4982-8da8-21676fe4e9f4","starttime":1408746026958,"relatedentities":{},"primaryfilters":{"oldprop":["val"]},"otherinfo":{}}]}
> {noformat}
> Perhaps this is what the design intended, but from an API user's standpoint it's really confusing: to upload events I have to upload the entity itself, yet the changes are not reflected.

-- This message was sent by Atlassian JIRA (v6.2#6252)
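To make the reported behavior concrete: the server acts as if an entity's primary filters are written once and never rewritten, so a resubmission that drops oldprop and adds newprop still reads back as {oldprop: [val]}. A toy first-write-wins model of that observed behavior (purely illustrative; this is not the actual timeline store code, and the real cause may differ):

```python
class ToyTimelineStore:
    """First-write-wins model of the behavior reported here: changes to an
    already-stored entity's primary filters are silently ignored."""

    def __init__(self):
        self._primaryfilters = {}

    def put(self, entity_id, primaryfilters):
        # setdefault only writes if the key is absent, so later puts
        # do not replace what was stored first.
        self._primaryfilters.setdefault(entity_id, dict(primaryfilters))

    def get(self, entity_id):
        return self._primaryfilters.get(entity_id, {})
```

Under this model, resubmitting the entity with changed filters leaves the stored value untouched, exactly as the GET response above shows.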
[jira] [Updated] (YARN-2445) ATS does not reflect changes to uploaded TimelineEntity
[ https://issues.apache.org/jira/browse/YARN-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated YARN-2445: -- Attachment: ats2.java

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-941) RM Should have a way to update the tokens it has for a running application
[ https://issues.apache.org/jira/browse/YARN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039318#comment-14039318 ] Marcelo Vanzin commented on YARN-941: --

[~jianhe], just to give a little more context about my question: what I was really thinking is that, if there were a method for securely exchanging tokens, there would be no need for token renewal, since there wouldn't be an avenue of attack based on sniffing tokens off the network traffic. (Unless, at that point, you're worried about an attacker guessing the token and you want to pre-emptively get a new one from time to time, but that sounds like an unlikely attack - unless the token generator itself is weak.)

> RM Should have a way to update the tokens it has for a running application
> Key: YARN-941
> URL: https://issues.apache.org/jira/browse/YARN-941
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Robert Joseph Evans
> Assignee: Xuan Gong
> Attachments: YARN-941.preview.2.patch, YARN-941.preview.3.patch, YARN-941.preview.4.patch, YARN-941.preview.patch
>
> When an application is submitted to the RM it includes a set of tokens that the RM will renew on behalf of the application, that will be passed to the AM when the application is launched, and that will be used when launching the application to access HDFS to download files on behalf of the application. For long-lived applications/services these tokens can expire; then the tokens that the AM has will be invalid, and the tokens that the RM had will also not work to launch a new AM.
> We need to provide an API that will allow the RM to replace the current tokens for this application with a new set. To avoid any real race issues, I think this API should be something that the AM calls, so that the client can connect to the AM with a new set of tokens it got using kerberos; the AM can then inform the RM of the new set of tokens and quickly update its tokens internally to use these new ones.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-941) RM Should have a way to update the tokens it has for a running application
[ https://issues.apache.org/jira/browse/YARN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039471#comment-14039471 ] Marcelo Vanzin commented on YARN-941: --

Right, it's easy to mix those up. :-) Yes, since token renewal is already in place, there's no need to touch that at the moment (although you could argue it's less useful when your tokens are secure). But if we're adding more complexity to allow replacing tokens, it might be a good opportunity to look deeper at the underlying problem this feature is trying to solve.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-941) RM Should have a way to update the tokens it has for a running application
[ https://issues.apache.org/jira/browse/YARN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037850#comment-14037850 ] Marcelo Vanzin commented on YARN-941: --

[~ste...@apache.org], thanks for the comments, but I do understand the part about renewing the token. My question was more along the lines of: what prevents the attacker from getting the new token and using it? That's why I called it an attack mitigation feature. If an attacker gets a token, that particular token is only usable for a period of time. But nothing seems to prevent the attack in the first place - so if an attacker is able to get the first token, he is able to get any future tokens using exactly the same approach. I understand that renewing tokens is needed for long-running processes. I'm just trying to understand whether this is the right approach from a security perspective, and if it's not, whether it wouldn't be good to spend some time thinking about a more secure way of exchanging these tokens.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (YARN-941) RM Should have a way to update the tokens it has for a running application
[ https://issues.apache.org/jira/browse/YARN-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003796#comment-14003796 ] Marcelo Vanzin commented on YARN-941: --

Apologies for jumping into the middle of the conversation. I don't have a lot of background in the Yarn code here, but from this bug and some internal discussions I have a question for people who are more familiar with this code: what is the purpose of this renewal mechanism? So far it seems to me that it's an attack mitigation feature: an attacker who is able to get the token would only be able to use it while the original application (i) is running and (ii) keeps renewing the token. If that's true, it sounds to me like the real problem is that it's possible to sniff the token in the first place. Wouldn't it be better, at that point, to have a protocol that doesn't allow that? Either using full-blown encryption for the RPC channels, or, if that's deemed too expensive, some mechanism where tokens are negotiated instead of sent in plain text over the wire.

-- This message was sent by Atlassian JIRA (v6.2#6252)