[
https://issues.apache.org/jira/browse/MAPREDUCE-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Erik.fang updated MAPREDUCE-4451:
---------------------------------
Attachment: MAPREDUCE-4451_branch-1.patch
Hi Alejandro,
A BlockingQueue plus a ThreadPoolExecutor would do the same job; I mostly
ported the code from EagerTaskInitializationListener, which early versions of
the fair scheduler used.
Inspired by your comment, I looked at the implementation of
ThreadPoolExecutor.execute() and found the root cause.
The main problem is where the ThreadPoolExecutor's threads get created. When a
new job arrives, threadPool.execute(new InitJob(jobInfo, job)) is called. If
the pool holds fewer than corePoolSize threads, a new thread is created (in
ThreadPoolExecutor.addIfUnderCorePoolSize()), and that creation happens inside
the UGI.doAs() block, i.e. as the RPC client's remote user.
So we just warm up all the threads in the pool beforehand, using
threadPool.prestartAllCoreThreads().
This is easy to demonstrate:
{code}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;

import org.apache.hadoop.security.UserGroupInformation;

public class TestThreadPool {
    ThreadPoolExecutor threadPool;
    UserGroupInformation u1;
    UserGroupInformation u2;

    // Prints the user the pool thread believes it is running as.
    class Task implements Runnable {
        @Override
        public void run() {
            try {
                UserGroupInformation u = UserGroupInformation.getCurrentUser();
                System.out.println(u.getUserName());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public void init() throws IOException {
        u1 = UserGroupInformation.getCurrentUser();
        u2 = UserGroupInformation.createRemoteUser("tony");
        threadPool = (ThreadPoolExecutor) Executors.newFixedThreadPool(1);
        // Create the worker thread now, outside any doAs() block.
        threadPool.prestartAllCoreThreads();
    }

    public void doWork() throws IOException, InterruptedException {
        // Submit the task from inside u2's doAs() block.
        u2.doAs(new PrivilegedExceptionAction<Object>() {
            public Object run() throws Exception {
                threadPool.execute(new Task());
                return null;
            }
        });
    }

    public void done() {
        threadPool.shutdown();
    }

    public static void main(String[] args)
            throws IOException, InterruptedException {
        TestThreadPool t = new TestThreadPool();
        t.init();
        t.doWork();
        t.done();
    }
}
{code}
Running this prints "erik"; comment out the threadPool.prestartAllCoreThreads()
call and it prints "tony".
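The same inheritance can be shown without Hadoop at all, using the JDK's own
javax.security.auth.Subject, which is what UGI wraps internally. Below is a
minimal sketch (the class name PoolContextDemo and the "tony"/"none" labels
are mine, and it assumes a JDK where Subject.doAs is still supported): a
pre-started pool reports no subject, while a pool whose thread is created
lazily inside doAs() inherits that subject, because Thread captures the
creator's access-control context at construction time.

{code}
import java.security.AccessController;
import java.security.Principal;
import java.security.PrivilegedAction;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadPoolExecutor;
import javax.security.auth.Subject;

public class PoolContextDemo {

    // Returns { name seen by the pre-started pool, name seen by the lazy pool }.
    public static String[] run() throws Exception {
        Subject tony = new Subject();
        tony.getPrincipals().add(new Principal() {
            public String getName() { return "tony"; }
        });

        // Warm pool: worker thread created eagerly, before any doAs() block.
        final ThreadPoolExecutor warm =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(1);
        warm.prestartAllCoreThreads();

        // Cold pool: worker thread is created lazily, on the first execute().
        final ThreadPoolExecutor cold =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(1);

        // Reports which subject (if any) the calling thread is bound to.
        final Callable<String> whoAmI = new Callable<String>() {
            public String call() {
                Subject s = Subject.getSubject(AccessController.getContext());
                return (s == null || s.getPrincipals().isEmpty())
                        ? "none"
                        : s.getPrincipals().iterator().next().getName();
            }
        };

        String[] seen = Subject.doAs(tony, new PrivilegedAction<String[]>() {
            public String[] run() {
                try {
                    // Warm pool's thread predates doAs(): no subject attached.
                    String fromWarm = warm.submit(whoAmI).get();
                    // Cold pool creates its thread right here, inside doAs(),
                    // so the new thread inherits tony's context.
                    String fromCold = cold.submit(whoAmI).get();
                    return new String[] { fromWarm, fromCold };
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        });

        warm.shutdown();
        cold.shutdown();
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String[] seen = run();
        System.out.println("warm pool sees: " + seen[0]);
        System.out.println("cold pool sees: " + seen[1]);
    }
}
{code}

This is the same reason UGI.getCurrentUser() returns the remote user in the
fair scheduler's init thread: the thread happened to be born inside the RPC
handler's doAs() block.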
A new patch is uploaded.
Thanks,
Erik
> fairscheduler fail to init job with kerberos authentication configured
> ----------------------------------------------------------------------
>
> Key: MAPREDUCE-4451
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4451
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/fair-share
> Affects Versions: 1.0.3
> Reporter: Erik.fang
> Fix For: 1.1.0
>
> Attachments: MAPREDUCE-4451_branch-1.patch,
> MAPREDUCE-4451_branch-1.patch, MAPREDUCE-4451_branch-1.patch,
> MAPREDUCE-4451_branch-1.patch
>
>
> Using FairScheduler in Hadoop 1.0.3 with kerberos authentication configured.
> Job initialization fails:
> {code}
> 2012-07-17 15:15:09,220 ERROR org.apache.hadoop.mapred.JobTracker: Job
> initialization failed:
> java.io.IOException: Call to /192.168.7.80:8020 failed on local exception:
> java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed
> [Caused by GSSException: No valid credentials provided (Mechanism level:
> Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client.wrapException(Client.java:1129)
> at org.apache.hadoop.ipc.Client.call(Client.java:1097)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
> at $Proxy7.getProtocolVersion(Unknown Source)
> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:411)
> at
> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:125)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:329)
> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:294)
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
> at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1411)
> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1429)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> at
> org.apache.hadoop.security.Credentials.writeTokenStorageFile(Credentials.java:169)
> at
> org.apache.hadoop.mapred.JobInProgress.generateAndStoreTokens(JobInProgress.java:3558)
> at
> org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:696)
> at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:3911)
> at
> org.apache.hadoop.mapred.FairScheduler$JobInitializer$InitJob.run(FairScheduler.java:301)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS
> initiate failed [Caused by GSSException: No valid credentials provided
> (Mechanism level: Failed to find any Kerberos tgt)]
> at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:543)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
> at
> org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:488)
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:590)
> at
> org.apache.hadoop.ipc.Client$Connection.access$2100(Client.java:187)
> at org.apache.hadoop.ipc.Client.getConnection(Client.java:1228)
> at org.apache.hadoop.ipc.Client.call(Client.java:1072)
> ... 20 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by
> GSSException: No valid credentials provided (Mechanism level: Failed to find
> any Kerberos tgt)]
> at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
> at
> org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:134)
> at
> org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:385)
> at
> org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:187)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:583)
> at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:580)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
> at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:579)
> ... 23 more
> Caused by: GSSException: No valid credentials provided (Mechanism level:
> Failed to find any Kerberos tgt)
> at
> sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:130)
> at
> sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:106)
> at
> sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:172)
> at
> sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:209)
> at
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:195)
> at
> sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
> at
> com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
> ... 32 more
> {code}
> When a job is submitted, the fair scheduler calls JobTracker.initJob, which
> calls JobInProgress.generateAndStoreTokens to write security keys to HDFS.
> However, that operation runs inside the server-side RPC call path, under a
> UGI created by UserGroupInformation.createRemoteUser in the RPC server,
> which has no TGT. It should be done with the UGI used by the JobTracker
> itself.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira