[jira] [Commented] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969524#comment-15969524 ]

Michael Gummelt commented on MAPREDUCE-6876:
--------------------------------------------

Yea, I really mean {{getSplits}}. And that proposal sounds perfect.

> FileInputFormat.listStatus should not fetch delegation tokens
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-6876
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6876
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Michael Gummelt
>
> {{FileInputFormat.listStatus}} fetches delegation tokens:
> https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L213
>
> AFAICT, this is unnecessary. {{listStatus}} doesn't delegate those tokens to
> another process. This is causing issues described in the attached Spark
> Kerberos ticket, because {{TokenCache.obtainTokensForNameNodes}}, which is
> used to fetch the delegation tokens, assumes that certain MapReduce
> configuration variables are set, which isn't true in the Spark calling code.
> This is a separate problem, but nonetheless it wouldn't have arisen if
> {{listStatus}} weren't fetching delegation tokens.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
[jira] [Commented] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969458#comment-15969458 ]

Michael Gummelt commented on MAPREDUCE-6876:
--------------------------------------------

bq. The job submitting code does not know where the input lives nor how to grab tokens for it – that's the responsibility of the input format.

That's fine, but it should be factored out into a separate method that the job submission code can then delegate to. {{listStatus}} does not require delegation tokens, so it shouldn't fetch delegation tokens.
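The refactoring suggested in this comment, where listing inputs and fetching delegation tokens become separate steps, might look roughly like the sketch below. It is only an illustration under stated assumptions: every type and method in it (`InputFormatSketch`, its nested `Credentials`, `listStatus`, `obtainTokens`, `authorityOf`) is a hypothetical stand-in written for this example, not a Hadoop class or signature, and the file listing is faked rather than read from a real file system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these types are Hadoop classes. */
final class InputFormatSketch {

    /** Stand-in for a credentials store that collects delegation tokens. */
    static final class Credentials {
        private final Set<String> tokens = new HashSet<>();

        void addToken(String fileSystem) {
            tokens.add("token-for-" + fileSystem);
        }

        int size() {
            return tokens.size();
        }
    }

    /** Listing step: returns file names and never touches credentials. */
    static List<String> listStatus(List<String> inputDirs) {
        List<String> files = new ArrayList<>();
        for (String dir : inputDirs) {
            files.add(dir + "/part-00000"); // placeholder, not a real file system call
        }
        return files;
    }

    /** Separate step that job submission code could call before launching tasks. */
    static void obtainTokens(Credentials creds, List<String> inputDirs) {
        for (String dir : inputDirs) {
            creds.addToken(authorityOf(dir)); // the set keeps one token per file system
        }
    }

    /** Extracts "scheme://host" from a URI-like string, or "default" if there is no scheme. */
    static String authorityOf(String uri) {
        int schemeEnd = uri.indexOf("://");
        if (schemeEnd < 0) {
            return "default";
        }
        int pathStart = uri.indexOf('/', schemeEnd + 3);
        return pathStart < 0 ? uri : uri.substring(0, pathStart);
    }

    public static void main(String[] args) {
        List<String> dirs = Arrays.asList(
                "hdfs://nn1/data/a", "hdfs://nn1/data/b", "hdfs://nn2/logs");

        // A caller that only needs file or split information stops here.
        List<String> files = listStatus(dirs);
        System.out.println(files.size() + " files listed without fetching tokens");

        // Code that hands work to other processes asks for tokens explicitly.
        Credentials creds = new Credentials();
        obtainTokens(creds, dirs);
        System.out.println(creds.size() + " tokens obtained");
    }
}
```

With this split, a client that only wants split information calls the listing step alone, while job submission code calls both, so the token-fetching assumptions about MapReduce configuration never reach the first kind of caller.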
[jira] [Comment Edited] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969374#comment-15969374 ]

Michael Gummelt edited comment on MAPREDUCE-6876 at 4/14/17 6:48 PM:
---------------------------------------------------------------------

bq. The input format must obtain the necessary tokens for the tasks to be able to access the input splits, and this is how FileInputFormat accomplishes that.

But the {{FileInputFormat}} is just fetching split information. It doesn't create tasks. So it shouldn't need to fetch delegation tokens. That should be the responsibility of the job submitting code. As it is, client code that is just creating a {{FileInputFormat}} in order to fetch split information, such as we do in Spark, wouldn't need to fetch delegation tokens.

I'm not saying that delegation tokens aren't eventually needed for MapReduce jobs, it's just that this seems like the wrong place to fetch them.
[jira] [Comment Edited] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969374#comment-15969374 ]

Michael Gummelt edited comment on MAPREDUCE-6876 at 4/14/17 6:47 PM:
---------------------------------------------------------------------

bq. The input format must obtain the necessary tokens for the tasks to be able to access the input splits, and this is how FileInputFormat accomplishes that.

But the {{FileInputFormat}} is just fetching split information. It doesn't create tasks. So it shouldn't need to fetch delegation tokens. That should be the responsibility of the job submitting code. As it is, client code that is just creating a {{FileInputFormat}} in order to fetch split information, such as we do in Spark, wouldn't need to fetch delegation tokens.
[jira] [Comment Edited] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969374#comment-15969374 ]

Michael Gummelt edited comment on MAPREDUCE-6876 at 4/14/17 6:42 PM:
---------------------------------------------------------------------

bq. The input format must obtain the necessary tokens for the tasks to be able to access the input splits, and this is how FileInputFormat accomplishes that.

But the {{FileInputFormat}} is just fetching split information. It doesn't create tasks. So it shouldn't need to fetch delegation tokens. That should be the responsibility of the job submitting code. As it is, client code that is just creating a {{FileInputFormat}} in order to fetch split information, such as we do in Spark, wouldn't need to fetch delegation tokens.
[jira] [Commented] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
[ https://issues.apache.org/jira/browse/MAPREDUCE-6876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15969374#comment-15969374 ]

Michael Gummelt commented on MAPREDUCE-6876:
--------------------------------------------

> The input format must obtain the necessary tokens for the tasks to be able to
> access the input splits, and this is how FileInputFormat accomplishes that.

But the {{FileInputFormat}} is just fetching split information. It doesn't create tasks. So it shouldn't need to fetch delegation tokens. That should be the responsibility of the job submitting code. As it is, client code that is just creating a {{FileInputFormat}} in order to fetch split information, such as we do in Spark, wouldn't need to fetch delegation tokens.
[jira] [Created] (MAPREDUCE-6876) FileInputFormat.listStatus should not fetch delegation tokens
Michael Gummelt created MAPREDUCE-6876:
------------------------------------------

             Summary: FileInputFormat.listStatus should not fetch delegation tokens
                 Key: MAPREDUCE-6876
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6876
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Michael Gummelt

{{FileInputFormat.listStatus}} fetches delegation tokens:
https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L213

AFAICT, this is unnecessary. {{listStatus}} doesn't delegate those tokens to another process. This is causing issues described in the attached Spark Kerberos ticket, because {{TokenCache.obtainTokensForNameNodes}}, which is used to fetch the delegation tokens, assumes that certain MapReduce configuration variables are set, which isn't true in the Spark calling code. This is a separate problem, but nonetheless it wouldn't have arisen if {{listStatus}} weren't fetching delegation tokens.