[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185078#comment-14185078
 ] 

Hudson commented on YARN-2743:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #725 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/725/])
YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken 
identifiers to be tampered and thus causing app submission failures in secure 
mode. Contributed by Jian He. (vinodkv: rev 
018664550507981297fd9f91e29408e6b7801ea9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/proto/server/yarn_security_token.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/RMDelegationTokenIdentifierForTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMDelegationTokenIdentifierData.java


 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Fix For: 2.6.0

 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185168#comment-14185168
 ] 

Hudson commented on YARN-2743:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1914 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1914/])
YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken 
identifiers to be tampered and thus causing app submission failures in secure 
mode. Contributed by Jian He. (vinodkv: rev 
018664550507981297fd9f91e29408e6b7801ea9)
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMDelegationTokenIdentifierData.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/proto/server/yarn_security_token.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/RMDelegationTokenIdentifierForTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java


 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Fix For: 2.6.0

 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14185232#comment-14185232
 ] 

Hudson commented on YARN-2743:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1939 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1939/])
YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken 
identifiers to be tampered and thus causing app submission failures in secure 
mode. Contributed by Jian He. (vinodkv: rev 
018664550507981297fd9f91e29408e6b7801ea9)
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/proto/server/yarn_security_token.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMDelegationTokenIdentifierData.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/RMDelegationTokenIdentifierForTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java


 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Fix For: 2.6.0

 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-26 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184563#comment-14184563
 ] 

Vinod Kumar Vavilapalli commented on YARN-2743:
---

Okay, the patch looks good to me.

File YARN-2746 for the pending item.

Can you please validate if the test-failures are related or not?

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-26 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184578#comment-14184578
 ] 

Vinod Kumar Vavilapalli commented on YARN-2743:
---

Never mind, I analyzed the test failures myself.

Two tests are failing
 - TestAggregatedLogFormat passes for me locally. Cannot find logs on Jenkins. 
Will follow up separately.
 - TestFairScheduler.testContinuousScheduling failure is tracked at YARN-2666.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184582#comment-14184582
 ] 

Hudson commented on YARN-2743:
--

FAILURE: Integrated in Hadoop-trunk-Commit #6347 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/6347/])
YARN-2743. Fixed a bug in ResourceManager that was causing RMDelegationToken 
identifiers to be tampered and thus causing app submission failures in secure 
mode. Contributed by Jian He. (vinodkv: rev 
018664550507981297fd9f91e29408e6b7801ea9)
* hadoop-yarn-project/CHANGES.txt
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/proto/server/yarn_security_token.proto
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/pom.xml
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/FileSystemRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/YARNDelegationTokenIdentifier.java
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/RMStateStoreTestBase.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/RMDelegationTokenIdentifierForTest.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/proto/yarn_server_resourcemanager_recovery.proto
* 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/ZKRMStateStore.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/test/java/org/apache/hadoop/yarn/security/TestYARNTokenIdentifier.java
* 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/recovery/records/RMDelegationTokenIdentifierData.java


 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-26 Thread Xuan Gong (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184621#comment-14184621
 ] 

Xuan Gong commented on YARN-2743:
-

[~vinodkv] [~jianhe]
Test case failure:TestAggregatedLogFormat will be tracked in YARN-2747.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Fix For: 2.6.0

 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184214#comment-14184214
 ] 

Jian He commented on YARN-2743:
---

This is broken by YARN-2615, which added an extra filed(renewDate) in the 
{{YARNDelegationTokenIdentifier}}.
The following code in {{AbstractDelegationTokenSecretManager}}, we first create 
the {{password}} based on the identifier(note that the password doesn’t have 
the extra field at this point), and then we call {{storeToken}} which invokes 
underlying RMStateStore implementation to store the token
{code}
byte[] password = createPassword(identifier.getBytes(), 
currentKey.getKey());
DelegationTokenInformation tokenInfo = new DelegationTokenInformation(now
+ tokenRenewInterval, password, getTrackingIdIfEnabled(identifier));
try {
  storeToken(identifier, tokenInfo);
} catch (IOException ioe) {
  LOG.error(Could not store token !!, ioe);
}
{code}
But inside the state-store implementation, we inserted an EXTRA field 
{{RenewDate}}in the YarnDelegationTokenIdentifier. This causes the previously 
password generated without the extra field, but the final identifier has the 
extra field
{code}
  rmDTIdentifier.setRenewDate(renewDate);
  rmDTIdentifier.write(tokenOut);
{code}

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184219#comment-14184219
 ] 

Jian He commented on YARN-2743:
---

Uploaded a patch. 
- removed the extra renewDate field from YARNDelegationTokenIdentifier
- simplified YARNDelegationTokenIdentifier to clean up duplicate methods.
-  changed the default to be 0, as {{AbstractDelegationTokenIdentifier}} has 
the default value 0 for masterKeyid, (this is not a bug, just for consistency )
{code}
optional int32 masterKeyId = 7 [default = -1];
{code}
- added test in RMStateStoreTestBase, to make sure the token before store and 
after store is the same.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Vinod Kumar Vavilapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184243#comment-14184243
 ] 

Vinod Kumar Vavilapalli commented on YARN-2743:
---

Did a quick scan of the patch. We are rewriting parts of YARN-2615, sigh!

Comments:
 - You now made a custom format for the state-store write - this solution works 
but not ideal. You should create a separate store specific proto for storing 
tokens in the state-store that encompasses the client facing proto.
 -  masterKeyId in yarn_security_token.proto is now missing [default = -1]; 
Actually the original default in AbstractDelegationTokenIdentifier is zero, 
let's use the same (another bug in YARN-2615).
 - I had a bigger question with YARN-2615 - why is RMDelegationTokenIdentifier 
converted to proto? RM is the reader and writer for this ID, so no need for 
this to be a proto?
 - YARNDelegationTokenIdentifier
-- It is so much cleaner now!
-- AbstractDTId had a version, we dropped that in the protobuf 
serialization. We should just write it during the serialization and read it 
back? We can separate the JIRA though.
-- Because of the way all the setters are behaving now, we should 
synchronize readFields, writeFields and all the setters. This is a problem with 
AbstractDelegationTokenIdentifier too, so okay splitting this too into its own 
JIRA.
 - Not related to your patch, but YARNDelegationTokenIdentifier should be 
private, or limited-private?
 - RMDelegationTokenIdentifierForTest is completely unmaintainable, duplicating 
a lot of code. Let's fix it.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184263#comment-14184263
 ] 

Hadoop QA commented on YARN-2743:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677123/YARN-2743.1.patch
  against trunk revision c51e53d.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.metrics2.impl.TestMetricsSystemImpl
  org.apache.hadoop.yarn.logaggregation.TestAggregatedLogFormat
  
org.apache.hadoop.yarn.server.resourcemanager.TestClientRMTokens

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5564//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5564//console

This message is automatically generated.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184349#comment-14184349
 ] 

Jian He commented on YARN-2743:
---

bq. why is RMDelegationTokenIdentifier converted to proto? RM is the reader and 
writer for this ID, so no need for this to be a proto?
It’ll be useful in rolling upgrades, when RM1 and RM2 have different  versions 
of the RMDT. i.e. writer of RM1 and reader of RM2 are incompatible.
Fixed other comments.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184376#comment-14184376
 ] 

Hadoop QA commented on YARN-2743:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677142/YARN-2743.2.patch
  against trunk revision 65d95b1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.logaggregation.TestAggregatedLogFormat

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5565//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5565//console

This message is automatically generated.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Jian He (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184382#comment-14184382
 ] 

Jian He commented on YARN-2743:
---

failed test passing locally, unrelated, re-trigger jenkins.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184401#comment-14184401
 ] 

Hadoop QA commented on YARN-2743:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12677142/YARN-2743.2.patch
  against trunk revision 65d95b1.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common 
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager:

  org.apache.hadoop.yarn.logaggregation.TestAggregatedLogFormat
  
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-YARN-Build/5568//testReport/
Console output: https://builds.apache.org/job/PreCommit-YARN-Build/5568//console

This message is automatically generated.

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 -

 Key: YARN-2743
 URL: https://issues.apache.org/jira/browse/YARN-2743
 Project: Hadoop YARN
  Issue Type: Bug
  Components: resourcemanager
Affects Versions: 2.6.0
Reporter: Arpit Gupta
Assignee: Jian He
Priority: Blocker
 Attachments: YARN-2743.1.patch, YARN-2743.2.patch


 During our HA testing we have seen yarn jobs run via oozie fail with failed 
 to renew delegation token errors on secure clusters and digest mismatch 
 errors on un secure clusters



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-24 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183902#comment-14183902
 ] 

Arpit Gupta commented on YARN-2743:
---

Here is the stack trace from un secure run

{code}
LogType: stderr
LogLength: 4637
Log Contents:
Failing Oozie Launcher, Main class 
[org.apache.oozie.action.hadoop.MapReduceMain], main() threw exception, 
DIGEST-MD5: digest response format violation. Mismatched response.
javax.security.sasl.SaslException: DIGEST-MD5: digest response format 
violation. Mismatched response. [Caused by 
org.apache.hadoop.ipc.RemoteException(javax.security.sasl.SaslException): 
DIGEST-MD5: digest response format violation. Mismatched response.]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
at 
org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
at 
org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getNewApplication(ApplicationClientProtocolPBClientImpl.java:210)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
at com.sun.proxy.$Proxy31.getNewApplication(Unknown Source)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNewApplication(YarnClientImpl.java:198)
at 
org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.createApplication(YarnClientImpl.java:206)
at 
org.apache.hadoop.mapred.ResourceMgrDelegate.getNewJobID(ResourceMgrDelegate.java:185)
at org.apache.hadoop.mapred.YARNRunner.getNewJobID(YARNRunner.java:231)
at 
org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:359)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1296)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1293)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1293)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:562)
at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:557)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:557)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:548)
at 
org.apache.oozie.action.hadoop.MapReduceMain.submitJob(MapReduceMain.java:108)
at 
org.apache.oozie.action.hadoop.MapReduceMain.run(MapReduceMain.java:66)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:39)
at 
org.apache.oozie.action.hadoop.MapReduceMain.main(MapReduceMain.java:37)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
{code}

 Yarn jobs via oozie fail with failed to renew token (secure) or digest 
 mismatch (unsecure) errors when RM is being killed
 

[jira] [Commented] (YARN-2743) Yarn jobs via oozie fail with failed to renew token (secure) or digest mismatch (unsecure) errors when RM is being killed

2014-10-24 Thread Arpit Gupta (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183899#comment-14183899
 ] 

Arpit Gupta commented on YARN-2743:
---

Here is the stack trace from a secure run

{code}
{code}
2014-10-24 07:36:07,002 INFO  delegation.AbstractDelegationTokenSecretManager 
(AbstractDelegationTokenSecretManager.java:renewToken(452)) - Token renewal for 
identifier: owner=hrt_qa, renewer=yarn, 
realUser=oozie/ip-172-31-6-149.ec2.inter...@example.com, 
issueDate=1414136163937, maxDate=1414740963937, sequenceNumber=2, 
masterKeyId=2; total currentTokens 2
2014-10-24 07:36:07,004 WARN  security.DelegationTokenRenewer 
(DelegationTokenRenewer.java:handleDTRenewerAppSubmitEvent(661)) - Unable to 
add the application to the delegation token renewer.
java.io.IOException: Failed to renew token: Kind: RM_DELEGATION_TOKEN, Service: 
IP:8032,IP:8032, Ident: (owner=hrt_qa, renewer=yarn, 
realUser=oozie/ip-172-31-6-149.ec2.inter...@example.com, 
issueDate=1414136163937, maxDate=1414740963937, sequenceNumber=2, masterKeyId=2)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:394)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.access$500(DelegationTokenRenewer.java:70)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.handleDTRenewerAppSubmitEvent(DelegationTokenRenewer.java:657)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$DelegationTokenRenewerRunnable.run(DelegationTokenRenewer.java:638)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: yarn is trying to 
renew a token with wrong password
at 
org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:476)
at 
org.apache.hadoop.yarn.security.client.RMDelegationTokenIdentifier$Renewer.renew(RMDelegationTokenIdentifier.java:110)
at org.apache.hadoop.security.token.Token.renew(Token.java:377)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:477)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer$1.run(DelegationTokenRenewer.java:474)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.renewToken(DelegationTokenRenewer.java:473)
at 
org.apache.hadoop.yarn.server.resourcemanager.security.DelegationTokenRenewer.handleAppSubmitEvent(DelegationTokenRenewer.java:392)
... 6 more
2014-10-24 07:36:07,007 INFO  rmapp.RMAppImpl 
(RMAppImpl.java:rememberTargetTransitionsAndStoreState(962)) - Updating 
application application_1414136032036_0002 with final state: FAILED
2014-10-24 07:36:07,007 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(704)) - 
application_1414136032036_0002 State change from NEW to FINAL_SAVING
2014-10-24 07:36:07,008 INFO  recovery.RMStateStore 
(RMStateStore.java:transition(159)) - Updating info for app: 
application_1414136032036_0002
2014-10-24 07:36:07,062 INFO  recovery.ZKRMStateStore 
(ZKRMStateStore.java:processWatchEvent(874)) - Watcher event type: NodeCreated 
with state:SyncConnected for 
path:/rmstore/ZKRMStateRoot/RMAppRoot/application_1414136032036_0002 for 
Service org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in 
state org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: 
STARTED
2014-10-24 07:36:07,063 INFO  rmapp.RMAppImpl (RMAppImpl.java:handle(704)) - 
application_1414136032036_0002 State change from FINAL_SAVING to FAILED
2014-10-24 07:36:07,063 WARN  capacity.CapacityScheduler 
(CapacityScheduler.java:doneApplication(769)) - Couldn't find application 
application_1414136032036_0002
2014-10-24 07:36:07,063 WARN  resourcemanager.RMAuditLogger 
(RMAuditLogger.java:logFailure(262)) - USER=hrt_qa  OPERATION=Application 
Finished - Failed TARGET=RMAppManager RESULT=FAILURE  DESCRIPTION=App 
failed with state: FAILED   PERMISSIONS=Failed to renew token: Kind: 
RM_DELEGATION_TOKEN, Service: 172.31.6.151:8032,172.31.6.149:8032, Ident: 
(owner=hrt_qa, renewer=yarn, 
realUser=oozie/ip-172-31-6-149.ec2.inter...@example.com, 
issueDate=1414136163937, maxDate=1414740963937, sequenceNumber=2, 
masterKeyId=2)