[ 
https://issues.apache.org/jira/browse/HADOOP-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16269577#comment-16269577
 ] 

Daryn Sharp commented on HADOOP-15059:
--------------------------------------

*Long pondering:* The credentials format is not something that should have been 
trifled with without careful compatibility considerations.  A major 
compatibility break like this should have been preceded by a bridge release – 
i.e. proactively add v0/v1 read support to popular 2.x releases while 
continuing to write the v0 format.  Then flip the 3.x default format to v1 and 
everything works.  But it's too late.

I'm concerned the compatibility issues may not be confined to just YARN.  
Perhaps a bit contrived, but I've seen enough absurd UGI/token handling that 
little ceases to amaze me anymore.
Suppose 2.x/3.x code rewrites the creds but only updates the v0-file creds, 
because that's all that existed since the dawn of security.  That leads to:
# If 2.x rewrites the creds, a 2.x subprocess works.
# If 2.x or 3.x rewrites the creds, a 3.x subprocess is oblivious to the 
intended changes – it found the unchanged v1-file.
# If 3.x rewrites the creds, a 2.x subprocess blows up because the v0-file was 
incidentally changed to the v1 format.
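As a minimal illustration of the last scenario above (this is not the actual Hadoop Credentials code – the magic bytes, method names, and payload are made up for the sketch), a reader that only understands format version 0 has no choice but to reject a file that was rewritten with a version byte of 1:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class TokenStorageSketch {
    // Hypothetical magic header marking a token storage file.
    static final byte[] MAGIC = {'H', 'D', 'T', 'S'};

    // Write a (payload-less) token storage header in the given format version.
    static byte[] write(int version) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.write(MAGIC);
        out.writeByte(version);   // the format version follows the magic bytes
        // ... serialized tokens/secrets would follow here ...
        return buf.toByteArray();
    }

    // A 2.x-era reader: it predates v1, so any version other than 0 is fatal.
    static void readAsOldReader(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        byte[] magic = new byte[MAGIC.length];
        in.readFully(magic);
        int version = in.readByte();
        if (version != 0) {
            throw new IOException("Unknown version " + version
                + " in token storage.");
        }
        // ... deserialize v0 tokens ...
    }

    public static void main(String[] args) throws IOException {
        readAsOldReader(write(0));      // old reader, old format: fine
        try {
            readAsOldReader(write(1));  // 3.x rewrote the file as v1: blows up
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

This mirrors the "Unknown version 1 in token storage" failure quoted below: the old reader's version check is the only guard, so an incidental v1 rewrite is unrecoverable for it.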

For the sake of discussion: the UGI methods that read/write the creds would be 
a better place to transparently handle the dual-version files than all the 
yarn changes.  It won't avoid the aforementioned pitfalls, and it doesn't fix 
the stream methods.  Not to mention we are now stuck supporting two files.  If 
the format changes again, do we have to support 3 files?  The whole point of 
writing the version into the file is to support multiple versions.
--
*Short answer:*  I think 3.0 has to be the bridge release: continue writing 
the v0 format while supporting reads of the v1 format.  3.1 or 3.2 can then 
flip the default format to v1.  The file format change appears to be cosmetic, 
so I don't see any harm in waiting for v1 to become the default.
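The bridge behavior described above could be sketched roughly like this (illustrative names and payload, not the real Credentials API): the writer stays on v0 by default, while the reader dispatches on the version byte stored after the magic header and accepts either format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class BridgeCredentialsSketch {
    static final byte[] MAGIC = {'H', 'D', 'T', 'S'};  // hypothetical header
    static final int DEFAULT_WRITE_VERSION = 0;        // bridge release: stay on v0

    // Write a token payload in the given format version.
    static byte[] write(String secret, int version) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.write(MAGIC);
        out.writeByte(version);
        out.writeUTF(secret);   // stand-in for the serialized tokens
        return buf.toByteArray();
    }

    // A bridge-release reader: dispatch on the version byte, accept v0 and v1.
    static String read(byte[] data) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        in.skipBytes(MAGIC.length);
        int version = in.readByte();
        switch (version) {
            case 0:  // legacy payload
            case 1:  // newer payload; both decode identically in this sketch
                return in.readUTF();
            default:
                throw new IOException("Unknown version " + version
                    + " in token storage.");
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] v0 = write("token-data", DEFAULT_WRITE_VERSION);
        byte[] v1 = write("token-data", 1);
        System.out.println(read(v0).equals(read(v1)));  // both versions readable
    }
}
```

Because the default write stays v0, any 2.x subprocess keeps working, while the version-dispatching read means a later release can flip the default without another file-format dance.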

> 3.0 deployment cannot work with old version MR tar ball which break rolling 
> upgrade
> -----------------------------------------------------------------------------------
>
>                 Key: HADOOP-15059
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15059
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: security
>            Reporter: Junping Du
>            Assignee: Jason Lowe
>            Priority: Blocker
>         Attachments: HADOOP-15059.001.patch, HADOOP-15059.002.patch, 
> HADOOP-15059.003.patch
>
>
> I tried to deploy a 3.0 cluster with the 2.9 MR tar ball. The MR job failed 
> with the following error:
> {noformat}
> 2017-11-21 12:42:50,911 INFO [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for 
> application appattempt_1511295641738_0003_000001
> 2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: 
> Unable to load native-hadoop library for your platform... using builtin-java 
> classes where applicable
> 2017-11-21 12:42:51,118 FATAL [main] 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.RuntimeException: Unable to determine current user
>       at 
> org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
>       at 
> org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:220)
>       at 
> org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:212)
>       at 
> org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
>       at 
> org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
> Caused by: java.io.IOException: Exception reading 
> /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_000001/container_tokens
>       at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
>       at 
> org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
>       at 
> org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
>       at 
> org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
>       at 
> org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
>       ... 4 more
> Caused by: java.io.IOException: Unknown version 1 in token storage.
>       at 
> org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
>       at 
> org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
>       ... 8 more
> 2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting 
> with status 1: java.lang.RuntimeException: Unable to determine current user
> {noformat}
> I think it is due to a token incompatibility change between 2.9 and 3.0. As 
> we claim "rolling upgrade" is supported in Hadoop 3, we should fix this 
> before we ship 3.0; otherwise all running MR applications will get stuck 
> during/after the upgrade.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
