[jira] [Commented] (TEZ-4175) Consider removing YarnConfiguration where it's possible

2020-07-23 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17164098#comment-17164098
 ] 

Rajesh Balamohan commented on TEZ-4175:
---

[~abstractdog], thanks for sharing the patch. It would be fine to fix it in 
DAGAppMaster.

But changes such as the one in TezClient may need to be revisited. E.g., 
TezYarnClient would need the YARN conf to properly initialize yarnClient.

Otherwise, it would be good to double-check whether the TezConf passed to it 
already has the relevant YARN details.

https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java#L311
https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionState.java#L354

> Consider removing YarnConfiguration where it's possible
> ---
>
> Key: TEZ-4175
> URL: https://issues.apache.org/jira/browse/TEZ-4175
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: László Bodor
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4175.01.patch, TEZ-4175.02.patch, TEZ-4175.03.patch, 
> TEZ-4175.03.patch
>
>
> A comment in DAGAppMaster made me think that we don't need to rely on 
> [YarnConfiguration|https://github.com/apache/hadoop/blob/branch-3.1.3/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java]
>  in all cases. What if it can be replaced with the base Configuration?
> {code}
>   // TODO Does this really need to be a YarnConfiguration ?
>   Configuration conf = new Configuration(new YarnConfiguration());
> {code}
> In the Hadoop 3.1.3 source, I cannot see that it adds, e.g., yarn-site as a 
> resource:
> {code}
>   public YarnConfiguration() {
>     super();
>   }
>
>   public YarnConfiguration(Configuration conf) {
>     super(conf);
>     if (!(conf instanceof YarnConfiguration)) {
>       this.reloadConfiguration();
>     }
>   }
> {code}
> in current codebase:
> {code}
> grep -iRH "new YarnConfiguration" --include="*.java"
> tez-plugins/tez-history-parser/src/main/java/org/apache/tez/history/ATSImportTool.java: YarnConfiguration yarnConf = new YarnConfiguration(conf);
> tez-plugins/tez-aux-services/src/main/java/org/apache/tez/auxservices/ShuffleHandler.java: super.serviceInit(new YarnConfiguration(conf));
> tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: YarnConfiguration yarnConf = new YarnConfiguration(tezConf);
> tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: YarnConfiguration yarnConf = new YarnConfiguration(tezConf);
> tez-api/src/test/java/org/apache/tez/dag/api/client/rpc/TestDAGClient.java: YarnConfiguration yarnConf = new YarnConfiguration(tezConf);
> tez-api/src/test/java/org/apache/tez/client/TestTezClient.java: tezClient.init(new TezConfiguration(false), new YarnConfiguration());
> tez-api/src/main/java/org/apache/tez/client/TezClient.java: amConfig.setYarnConfiguration(new YarnConfiguration(amConfig.getTezConfiguration()));
> tez-api/src/main/java/org/apache/tez/client/TezClient.java: amConfig.setYarnConfiguration(new YarnConfiguration(amConfig.getTezConfiguration()));
> tez-api/src/main/java/org/apache/tez/client/TezClient.java: return getDAGClient(appId, tezConf, new YarnConfiguration(tezConf), frameworkClient, ugi);
> tez-tests/src/test/java/org/apache/tez/test/FaultToleranceTestRunner.java: tezConf = new TezConfiguration(new YarnConfiguration());
> tez-tests/src/test/java/org/apache/tez/test/FaultToleranceTestRunner.java: tezConf = new TezConfiguration(new YarnConfiguration(this.conf));
> tez-mapreduce/src/test/java/org/apache/tez/mapreduce/hadoop/TestMRInputHelpers.java: Configuration testConf = new YarnConfiguration(
> tez-mapreduce/src/main/java/org/apache/tez/mapreduce/client/YARNRunner.java: this(conf, new ResourceMgrDelegate(new YarnConfiguration(conf)));
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration conf = new Configuration(new YarnConfiguration());
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration conf = new Configuration(new YarnConfiguration());
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration tezConf = new Configuration(new YarnConfiguration());
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration tezConf = new Configuration(new YarnConfiguration());
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration tezConf = new Configuration(new YarnConfiguration());
> tez-dag/src/test/java/org/apache/tez/dag/app/rm/TestContainerReuse.java: Configuration 
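
A note on the TEZ-4175 description's reading of the constructors, offered as something to verify against the linked source rather than as a conclusion: in the Hadoop source, YarnConfiguration registers yarn-default.xml and yarn-site.xml through Configuration.addDefaultResource in a static initializer, so nothing shows up in the constructors themselves. The sketch below only imitates that pattern. BaseConf, YarnLikeConf and StaticInitDemo are hypothetical stand-ins, not Hadoop classes, and the resource handling is reduced to a list of names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a base configuration class; not Hadoop code.
class BaseConf {
  private static final List<String> DEFAULT_RESOURCES = new ArrayList<>();

  static {
    addDefaultResource("core-site.xml");
  }

  // Registers a resource name once, for every configuration instance.
  static synchronized void addDefaultResource(String name) {
    if (!DEFAULT_RESOURCES.contains(name)) {
      DEFAULT_RESOURCES.add(name);
    }
  }

  // Returns a copy so callers cannot mutate the shared list.
  static synchronized List<String> defaultResources() {
    return new ArrayList<>(DEFAULT_RESOURCES);
  }
}

// Hypothetical stand-in for a subclass whose constructors look empty.
class YarnLikeConf extends BaseConf {
  static {
    // Runs once, when the class is first initialized, before any constructor body.
    addDefaultResource("yarn-site.xml");
  }

  YarnLikeConf() {
    super();
  }
}

final class StaticInitDemo {
  public static void main(String[] args) {
    new BaseConf();
    System.out.println(BaseConf.defaultResources());
    new YarnLikeConf(); // first use triggers the static initializer above
    System.out.println(BaseConf.defaultResources());
  }
}
```

If the real class behaves this way, merely instantiating it once changes what every later plain Configuration loads, which is one reason removing it case by case needs care.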

[jira] [Commented] (TEZ-4204) Data race in RootInputInitializerManager

2020-07-23 Thread Ashutosh Chauhan (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163997#comment-17163997
 ] 

Ashutosh Chauhan commented on TEZ-4204:
---

+1

> Data race in RootInputInitializerManager
> 
>
> Key: TEZ-4204
> URL: https://issues.apache.org/jira/browse/TEZ-4204
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Mustafa Iman
> Assignee: Mustafa Iman
> Priority: Blocker
> Attachments: TEZ-4204.1.patch, TEZ-4204.1.patch, TEZ-4204.2.patch
>
>
> After https://issues.apache.org/jira/browse/TEZ-4170 there is a data race for 
> initializerMap in RootInputInitializerManager. initializerMap should be 
> initialized before vertex state is set to initializing.
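
The ordering requirement in the description above (populate initializerMap first, then move the vertex to the initializing state) can be sketched as follows. This is not the TEZ-4204 patch: VertexSketch, its State enum and the String-valued map are hypothetical, and the sketch assumes readers check the state before touching the map. It relies on the Java memory model guarantee that writes made before a volatile store are visible to a thread that subsequently reads that volatile field.

```java
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a vertex; not Tez code.
final class VertexSketch {
  enum State { NEW, INITIALIZING }

  // Plain field: made visible to readers by the volatile write to 'state' below.
  private Map<String, String> initializerMap;
  private volatile State state = State.NEW;

  void startInitializing(Map<String, String> initializers) {
    // 1. Fully build the map first.
    initializerMap = new ConcurrentHashMap<>(initializers);
    // 2. Only then publish the new state (volatile write).
    state = State.INITIALIZING;
  }

  // Returns null until the vertex has entered INITIALIZING.
  String lookup(String inputName) {
    if (state != State.INITIALIZING) { // volatile read
      return null;
    }
    return initializerMap.get(inputName);
  }
}

final class VertexSketchDemo {
  public static void main(String[] args) {
    VertexSketch vertex = new VertexSketch();
    vertex.startInitializing(Collections.singletonMap("input", "initializer"));
    System.out.println(vertex.lookup("input"));
  }
}
```

Swapping steps 1 and 2 would let another thread observe INITIALIZING while the map is still null, which is the kind of race the issue describes.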



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (TEZ-4094) Change Task Interface to use List instead of Concrete ArrayList

2020-07-23 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163963#comment-17163963
 ] 

TezQA commented on TEZ-4094:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 25m 10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 12m 21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 48s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  2m  5s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 31s{color} | {color:orange} tez-dag: The patch generated 1 new + 366 unchanged - 1 fixed = 367 total (was 367) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  4m 34s{color} | {color:green} tez-dag in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 51s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-TEZ-Build/496/artifact/out/Dockerfile |
| JIRA Issue | TEZ-4094 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12984551/TEZ-4094.2.patch |
| Optional Tests | dupname asflicense javac javadoc unit spotbugs findbugs checkstyle compile |
| uname | Linux 5dd6dd9b5833 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 2d7c60849 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/496/artifact/out/diff-checkstyle-tez-dag.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/496/testReport/ |
| Max. process+thread count | 183 (vs. ulimit of 5500) |
| modules | C: tez-dag U: tez-dag |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/496/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> Change Task Interface to use List instead of Concrete ArrayList
> ---
>
> Key: TEZ-4094
> URL: https://issues.apache.org/jira/browse/TEZ-4094
> 

[jira] [Commented] (TEZ-4070) SSLFactory not closed in DAGClientTimelineImpl caused native memory issues

2020-07-23 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163957#comment-17163957
 ] 

TezQA commented on TEZ-4070:


| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  1m  0s{color} | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 57s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 12s{color} | {color:orange} tez-api: The patch generated 1 new + 66 unchanged - 0 fixed = 67 total (was 66) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 44s{color} | {color:green} tez-api in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m  7s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 11m 54s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-TEZ-Build/497/artifact/out/Dockerfile |
| JIRA Issue | TEZ-4070 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008318/TEZ-4070.04.patch |
| Optional Tests | dupname asflicense javac javadoc unit xml compile spotbugs findbugs checkstyle |
| uname | Linux 24be47f4145e 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/tez.sh |
| git revision | master / 2d7c60849 |
| Default Java | Private Build-1.8.0_252-8u252-b09-1~18.04-b09 |
| checkstyle | https://builds.apache.org/job/PreCommit-TEZ-Build/497/artifact/out/diff-checkstyle-tez-api.txt |
| Test Results | https://builds.apache.org/job/PreCommit-TEZ-Build/497/testReport/ |
| Max. process+thread count | 246 (vs. ulimit of 5500) |
| modules | C: tez-api U: tez-api |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/497/console |
| versions | git=2.17.1 maven=3.6.0 findbugs=3.0.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> SSLFactory not closed in DAGClientTimelineImpl caused native memory issues
> --
>
> Key: TEZ-4070
> URL: 

[jira] [Commented] (TEZ-4094) Change Task Interface to use List instead of Concrete ArrayList

2020-07-23 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163946#comment-17163946
 ] 

Jonathan Turner Eagles commented on TEZ-4094:
-

That piece of code was added as a scalability feature, to prevent allocation of 
all attempts. The code is messy. Perhaps a better comment and some code tidying 
would improve readability while keeping the runtime optimized for scale.

> Change Task Interface to use List instead of Concrete ArrayList
> ---
>
> Key: TEZ-4094
> URL: https://issues.apache.org/jira/browse/TEZ-4094
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: David Mollitor
> Priority: Minor
> Attachments: TEZ-4094.1.patch, TEZ-4094.2.patch
>
>






[jira] [Updated] (TEZ-4070) SSLFactory not closed in DAGClientTimelineImpl caused native memory issues

2020-07-23 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated TEZ-4070:

Attachment: TEZ-4070.04.patch

> SSLFactory not closed in DAGClientTimelineImpl caused native memory issues
> --
>
> Key: TEZ-4070
> URL: https://issues.apache.org/jira/browse/TEZ-4070
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Xun REN
> Assignee: László Bodor
> Priority: Major
> Attachments: TEZ-4070.01.patch, TEZ-4070.02.patch, TEZ-4070.03.patch, 
> TEZ-4070.04.patch
>
>
> Hi,
> Recently, we have been facing native memory issues on Red Hat servers. They 
> crashed our servers completely. 
> *Context:*
> - HDP-2.6.5 
> - Red Hat 7.4
> *Problem:*
> After upgrading from HDP-2.6.2 to HDP-2.6.5, and after several days of 
> running, our HiveServer2 can consume more than 100GB of memory, even though 
> we have configured Xmx20G and MaxMetaspace to 10GB.
> After searching, we found a similar issue here:
> https://issues.apache.org/jira/browse/YARN-5309
> This is fixed in the hadoop-common module. Our version already includes this 
> fix; however, we still have the problem.
> After further searching, I found that in the class 
> org.apache.tez.dag.api.client.TimelineReaderFactory of Tez, if HTTPS is used 
> for YARN, it creates an SSLFactory which is not destroyed after use.
> TimelineReaderFactory is referenced in the class DAGClientTimelineImpl.
> If ATS is used and the DAG is completed, the method switchToTimelineClient in 
> the class DAGClientImpl will be called. It will close the previous HTTPClient, 
> but not the SSLFactory inside. And the SSLFactory will create a thread for 
> each connection. Eventually, we end up with thousands of threads consuming a 
> lot of native memory.
> {code:java}
> private void switchToTimelineClient() throws IOException, TezException {
>   realClient.close();
>   realClient = new DAGClientTimelineImpl(appId, dagId, conf, frameworkClient,
>       (int) (2 * PRINT_STATUS_INTERVAL_MILLIS));
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("dag completed switching to DAGClientTimelineImpl");
>   }
> }
> {code}
> I have checked another environment which is still on HDP-2.6.2, and it also 
> has a lot of running threads held by SSLFactory. That means the problem 
> is amplified in HDP-2.6.5.
>  
> *How to reproduce the problem:*
> 1. Use Tez as the Hive execution engine
> 2. Launch a Beeline session for Hive
> 3. Run a select with a simple where clause on a table
> 4. Repeat steps 2-3 in order to open different connections (as is the case 
> for a shared cluster with multiple clients).
> Finally, you can see in a thread dump that a lot of threads are 
> named "Truststore reloader thread", and the commands "top" or "ps" show that 
> native memory usage is very high.
>  
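
The leak pattern described above (a client gets closed, but a factory it created keeps a background reloader thread alive) and the usual remedy can be sketched as follows. This is a minimal illustration under assumptions, not the TEZ-4070 patch: ReloaderFactorySketch and TimelineClientSketch are hypothetical classes standing in for SSLFactory and the timeline client, and the reloader thread does nothing but sleep.

```java
// Hypothetical stand-in for a factory that starts a background reloader thread.
final class ReloaderFactorySketch implements AutoCloseable {
  private final Thread reloader;

  ReloaderFactorySketch() {
    reloader = new Thread(() -> {
      try {
        while (true) {
          Thread.sleep(1000); // pretend to poll the truststore
        }
      } catch (InterruptedException e) {
        // Interrupted by close(): fall through and let the thread end.
      }
    }, "Truststore reloader thread (sketch)");
    reloader.setDaemon(true);
    reloader.start();
  }

  boolean isRunning() {
    return reloader.isAlive();
  }

  @Override
  public void close() {
    reloader.interrupt();
    try {
      reloader.join(); // wait until the thread has actually finished
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}

// Hypothetical client that owns the factory and therefore must close it.
final class TimelineClientSketch implements AutoCloseable {
  private final ReloaderFactorySketch sslFactory = new ReloaderFactorySketch();

  ReloaderFactorySketch factory() {
    return sslFactory;
  }

  @Override
  public void close() {
    // Closing only the HTTP side would leave the reloader thread running.
    sslFactory.close();
  }
}

final class TimelineClientDemo {
  public static void main(String[] args) {
    try (TimelineClientSketch client = new TimelineClientSketch()) {
      System.out.println("running: " + client.factory().isRunning());
    }
  }
}
```

The design point is ownership: whichever object creates the factory is the one whose close() has to tear it down, otherwise each replaced client leaves one more thread behind.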





[jira] [Commented] (TEZ-4094) Change Task Interface to use List instead of Concrete ArrayList

2020-07-23 Thread David Mollitor (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163941#comment-17163941
 ] 

David Mollitor commented on TEZ-4094:
-

Thanks.

Just struck me as odd that there's this coupling of 
{{TaskImpl.EMPTY_TASK_ATTEMPT_TEZ_EVENTS}} and using concrete classes and not 
using {{Collections#emptyList}}.  Just trying to simplify and make it a little 
more decoupled.
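
For illustration, the decoupling described in this comment can be combined with the "no up-front allocation" concern raised elsewhere in the thread. The sketch below is not either TEZ-4094 patch: TaskSketch is a hypothetical class, and String stands in for the real event type. It exposes the List interface, allocates the backing list only when the first event arrives, and hands out Collections.emptyList() until then.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a task; not Tez code.
final class TaskSketch {
  // Allocated lazily, only once the first event arrives.
  private List<String> events;

  void addEvent(String event) {
    if (events == null) {
      events = new ArrayList<>();
    }
    events.add(event);
  }

  // Callers see only the List interface; no backing list exists until needed.
  List<String> getEvents() {
    if (events == null) {
      return Collections.emptyList();
    }
    return Collections.unmodifiableList(events);
  }
}

final class TaskSketchDemo {
  public static void main(String[] args) {
    TaskSketch task = new TaskSketch();
    System.out.println(task.getEvents().isEmpty());
    task.addEvent("attempt-0-started");
    System.out.println(task.getEvents().size());
  }
}
```

Whether this is cheaper than a shared empty constant in the real code path would need the kind of analysis the reviewers ask for.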



[jira] [Commented] (TEZ-4094) Change Task Interface to use List instead of Concrete ArrayList

2020-07-23 Thread Jonathan Turner Eagles (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163926#comment-17163926
 ] 

Jonathan Turner Eagles commented on TEZ-4094:
-

[~belugabehr], there has been much work to ensure the scalability of Tez. By 
allocating memory space for all attempts up front, we work against that goal.

As to the need for this change (changing from ArrayList to List in this case), 
it will need some better analysis to demonstrate the value proposition. 
Currently, I'm leaning toward closing as "Won't Fix".



[jira] [Updated] (TEZ-4204) Data race in RootInputInitializerManager

2020-07-23 Thread Jonathan Turner Eagles (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Turner Eagles updated TEZ-4204:

Priority: Blocker  (was: Major)



[jira] [Commented] (TEZ-3894) Tez intermediate outputs implicitly rely on permissive umask for shuffle

2020-07-23 Thread Tarek Abouzeid (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-3894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163589#comment-17163589
 ] 

Tarek Abouzeid commented on TEZ-3894:
-

Hi,

An update to this ticket: in Hortonworks HDP, the umask setting for Tez was 
being fetched from the HDFS service umask setting, where it was 077. Changing 
it to 022 fixed the problem.

Best Regards, 

> Tez intermediate outputs implicitly rely on permissive umask for shuffle
> 
>
> Key: TEZ-3894
> URL: https://issues.apache.org/jira/browse/TEZ-3894
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jason Darrell Lowe
> Assignee: Jason Darrell Lowe
> Priority: Major
> Fix For: 0.9.2
>
> Attachments: TEZ-3894.001.patch
>
>
> Tez does not explicitly set the permissions of intermediate output files for 
> shuffle. In a secure cluster the shuffle service is running as a different 
> user than the task, so the output files require group readability in order to 
> serve up the data during the shuffle phase. If the umask is too restrictive 
> (e.g.: 077) then the task's file.out and file.out.index permissions can be 
> too restrictive to allow the shuffle handler to access them.
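
The effect of the umask values mentioned in this thread can be worked through with plain bit arithmetic. The sketch below is only an illustration: UmaskSketch is a hypothetical helper, and the requested mode of 0666 is an assumption (a common default for newly created files), since the thread does not state which mode the task actually requests.

```java
// Hypothetical helper; models how a umask clears bits from a requested mode.
final class UmaskSketch {
  private UmaskSketch() {}

  // Effective permissions = requested mode with the umask bits cleared.
  static int effectiveMode(int requestedMode, int umask) {
    return requestedMode & ~umask;
  }

  // True if the group-read bit (octal 040) is set.
  static boolean groupReadable(int mode) {
    return (mode & 0040) != 0;
  }

  public static void main(String[] args) {
    int requested = 0666; // assumed default for a newly created file
    for (int umask : new int[] {077, 022}) {
      int mode = effectiveMode(requested, umask);
      System.out.println("umask " + Integer.toOctalString(umask)
          + " -> mode " + Integer.toOctalString(mode)
          + ", group readable: " + groupReadable(mode));
    }
  }
}
```

Under that assumption, umask 077 yields mode 600, which a shuffle service running as a different user in the same group cannot read, while umask 022 yields mode 644, which it can.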





[jira] [Commented] (TEZ-4205) Support RM delegation token

2020-07-23 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163496#comment-17163496
 ] 

TezQA commented on TEZ-4205:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 28m 37s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  7s{color} | {color:red} TEZ-4205 does not apply to master. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/PreCommit-TEZ-Build/494/artifact/out/Dockerfile |
| JIRA Issue | TEZ-4205 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008262/TEZ-4205-0.9.2.patch |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/494/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (TEZ-4205) Support RM delegation token

2020-07-23 Thread TezQA (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163481#comment-17163481
 ] 

TezQA commented on TEZ-4205:


| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} | {color:red} TEZ-4205 does not apply to master. Rebase required? Wrong Branch? See https://cwiki.apache.org/confluence/display/TEZ/How+to+Contribute+to+Tez for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | TEZ-4205 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/13008262/TEZ-4205-0.9.2.patch |
| Console output | https://builds.apache.org/job/PreCommit-TEZ-Build/495/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |


This message was automatically generated.



> Support RM delegation token
> ---
>
> Key: TEZ-4205
> URL: https://issues.apache.org/jira/browse/TEZ-4205
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Eugene Chung
> Priority: Major
> Attachments: TEZ-4205-0.9.2.patch, TEZ-4205.01.patch
>
>
> I have a requirement to get some information from YARN Resource Manager like 
> [NodeReports|#getNodeReports-org.apache.hadoop.yarn.api.records.NodeState...-]].
> But on the kerberized cluster, I can't do it because of kerberos 
> authentication failure. 
> {code:java}
> 2020-05-26 14:29:03,044 [ERROR] [InputInitializer {Map 1} #0] |mapreduce.MyInputFormat|: getNodeReports error
> java.io.IOException: DestHost:destPort my-rm-address:9050 , LocalHost:localPort my-node-address:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
>  at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
>  at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1495)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1437)
>  at org.apache.hadoop.ipc.Client.call(Client.java:1347)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
>  at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
>  at com.sun.proxy.$Proxy54.getClusterNodes(Unknown Source)
>  at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:319)
>  at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  at com.sun.proxy.$Proxy55.getClusterNodes(Unknown Source)
>  at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:614)
>  ...
>  at com.naver.mapreduce.MyInputFormat.getSplits(MyInputFormat.java:537)
>  ...
>  at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:512)
>  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
>  at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
>  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
>  at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
>  at 
> 

[jira] [Updated] (TEZ-4205) Support RM delegation token

2020-07-23 Thread Eugene Chung (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Chung updated TEZ-4205:
--
Attachment: TEZ-4205-0.9.2.patch

> Support RM delegation token
> ---
>
> Key: TEZ-4205
> URL: https://issues.apache.org/jira/browse/TEZ-4205
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Eugene Chung
> Priority: Major
> Attachments: TEZ-4205-0.9.2.patch, TEZ-4205.01.patch
>
>

[jira] [Updated] (TEZ-4205) Support RM delegation token

2020-07-23 Thread Eugene Chung (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Chung updated TEZ-4205:
--
Attachment: TEZ-4205.0.9.2.patch

> Support RM delegation token
> ---
>
> Key: TEZ-4205
> URL: https://issues.apache.org/jira/browse/TEZ-4205
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Eugene Chung
> Priority: Major
> Attachments: TEZ-4205-0.9.2.patch, TEZ-4205.01.patch
>
>

[jira] [Updated] (TEZ-4205) Support RM delegation token

2020-07-23 Thread Eugene Chung (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Chung updated TEZ-4205:
--
Attachment: (was: TEZ-4205.0.9.2.patch)

> Support RM delegation token
> ---
>
> Key: TEZ-4205
> URL: https://issues.apache.org/jira/browse/TEZ-4205
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Eugene Chung
> Priority: Major
> Attachments: TEZ-4205-0.9.2.patch, TEZ-4205.01.patch
>
>

[jira] [Commented] (TEZ-4128) Logging: Fix ArrayOutOfBound in PipelineSorter

2020-07-23 Thread Rajesh Balamohan (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17163447#comment-17163447
 ] 

Rajesh Balamohan commented on TEZ-4128:
---

[~rameshkumar]: Is this still an issue? I believe this was due to 
"maxNumberOfBlocks" being 0.

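To illustrate the failure mode suspected above, here is a minimal sketch under 
stated assumptions; it is not the PipelinedSorter code. A hypothetical 
`BlockPool` sizes an array from `maxNumberOfBlocks`, so a value of 0 only 
fails later, on first access. Every name except `maxNumberOfBlocks` is 
invented for illustration.

```java
// Hypothetical stand-in, NOT the real PipelinedSorter: it only shows how a
// block count of 0 can turn into an ArrayIndexOutOfBoundsException.
final class BlockPool {
    private final byte[][] blocks;

    // Unchecked variant: accepts 0 and fails later, on first access.
    static BlockPool unchecked(int maxNumberOfBlocks, int blockSize) {
        return new BlockPool(maxNumberOfBlocks, blockSize);
    }

    // Guarded variant: rejects a non-positive block count up front.
    static BlockPool checked(int maxNumberOfBlocks, int blockSize) {
        if (maxNumberOfBlocks <= 0) {
            throw new IllegalArgumentException(
                "maxNumberOfBlocks must be positive, got " + maxNumberOfBlocks);
        }
        return new BlockPool(maxNumberOfBlocks, blockSize);
    }

    private BlockPool(int maxNumberOfBlocks, int blockSize) {
        // A zero-length outer array is legal in Java; the error surfaces later.
        blocks = new byte[maxNumberOfBlocks][blockSize];
    }

    // Throws ArrayIndexOutOfBoundsException when the pool holds no blocks.
    byte[] firstBlock() {
        return blocks[0];
    }
}
```

With the unchecked factory, `BlockPool.unchecked(0, 16).firstBlock()` throws 
`ArrayIndexOutOfBoundsException`, whereas `BlockPool.checked(0, 16)` fails 
immediately with a message that names the bad value.
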
> Logging: Fix ArrayOutOfBound in PipelineSorter
> --
>
> Key: TEZ-4128
> URL: https://issues.apache.org/jira/browse/TEZ-4128
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Ramesh Kumar Thangarajan
> Assignee: Ramesh Kumar Thangarajan
> Priority: Major
> Attachments: TEZ-4128.1.patch
>
>
> Fix ArrayOutOfBound in PipelineSorter



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (TEZ-4205) Support RM delegation token

2020-07-23 Thread Eugene Chung (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Chung updated TEZ-4205:
--
Description: 
I need to retrieve information from the YARN ResourceManager, such as 
[NodeReports|#getNodeReports-org.apache.hadoop.yarn.api.records.NodeState...-]].

On a Kerberized cluster, however, the call fails with a Kerberos authentication 
error. 
{code:java}
2020-05-26 14:29:03,044 [ERROR] [InputInitializer {Map 1} #0] |mapreduce.MyInputFormat|: getNodeReports error
java.io.IOException: DestHost:destPort my-rm-address:9050 , LocalHost:localPort my-node-address:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1495)
 at org.apache.hadoop.ipc.Client.call(Client.java:1437)
 at org.apache.hadoop.ipc.Client.call(Client.java:1347)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
 at com.sun.proxy.$Proxy54.getClusterNodes(Unknown Source)
 at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:319)
 at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
 at com.sun.proxy.$Proxy55.getClusterNodes(Unknown Source)
 at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:614)
 ...
 at com.naver.mapreduce.MyInputFormat.getSplits(MyInputFormat.java:537)
 ...
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:512)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
 at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
 at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
 at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
 at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
 at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:755)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:718)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:811)
 at 

[jira] [Created] (TEZ-4205) Support RM delegation token

2020-07-23 Thread Eugene Chung (Jira)
Eugene Chung created TEZ-4205:
-

 Summary: Support RM delegation token
 Key: TEZ-4205
 URL: https://issues.apache.org/jira/browse/TEZ-4205
 Project: Apache Tez
 Issue Type: Improvement
 Reporter: Eugene Chung


I need to retrieve information from the YARN ResourceManager, such as 
[NodeReports|https://hadoop.apache.org/docs/r3.1.0/api/org/apache/hadoop/yarn/client/api/YarnClient.html#getNodeReports-org.apache.hadoop.yarn.api.records.NodeState...-].

On a Kerberized cluster, however, the call fails with a Kerberos authentication 
error.
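
As an aid to reading the log that follows: the outer `IOException` is caused 
by a second `IOException`, whose message in turn names an 
`AccessControlException`. Below is a minimal sketch of walking such a chain to 
its innermost cause, assuming that nesting (the log is cut off before any 
deeper "Caused by" entry). `AccessControlFailure`, `CauseChain`, `rootCause` 
and `sample` are hypothetical names; the stand-in exception replaces the 
Hadoop class only so that the snippet compiles without Hadoop on the classpath.

```java
import java.io.IOException;

// Stand-in for org.apache.hadoop.security.AccessControlException (assumption:
// used here only to keep the sketch free of Hadoop dependencies).
class AccessControlFailure extends IOException {
    AccessControlFailure(String message) {
        super(message);
    }
}

final class CauseChain {
    // Follows getCause() links until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    // Rebuilds the assumed nesting: IOException -> IOException -> access failure.
    static IOException sample() {
        Throwable auth = new AccessControlFailure(
            "Client cannot authenticate via:[TOKEN, KERBEROS]");
        // IOException(Throwable) uses cause.toString() as its message.
        IOException sasl = new IOException(auth);
        return new IOException("Failed on local exception: " + sasl, sasl);
    }
}
```

Calling `CauseChain.rootCause(CauseChain.sample())` returns the 
`AccessControlFailure` instance, which is the part of the chain that says why 
the RPC was rejected.
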

 
{code:java}
2020-05-26 14:29:03,044 [ERROR] [InputInitializer {Map 1} #0] |mapreduce.MyInputFormat|: getNodeReports error
java.io.IOException: DestHost:destPort my-rm-address:9050 , LocalHost:localPort my-node-address:0. Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:806)
 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1495)
 at org.apache.hadoop.ipc.Client.call(Client.java:1437)
 at org.apache.hadoop.ipc.Client.call(Client.java:1347)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
 at com.sun.proxy.$Proxy54.getClusterNodes(Unknown Source)
 at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getClusterNodes(ApplicationClientProtocolPBClientImpl.java:319)
 at sun.reflect.GeneratedMethodAccessor30.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
 at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
 at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
 at com.sun.proxy.$Proxy55.getClusterNodes(Unknown Source)
 at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getNodeReports(YarnClientImpl.java:614)
 ...
 at com.naver.mapreduce.MyInputFormat.getSplits(MyInputFormat.java:537)
 ...
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:512)
 at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:781)
 at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:243)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:278)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:269)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:269)
 at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:253)
 at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
 at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
 at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
 at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:755)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 

[jira] [Comment Edited] (TEZ-4129) Delete intermediate attempt data for failed attempts for Shuffle Handler

2020-07-23 Thread Syed Shameerur Rahman (Jira)


[ 
https://issues.apache.org/jira/browse/TEZ-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17155180#comment-17155180
 ] 

Syed Shameerur Rahman edited comment on TEZ-4129 at 7/23/20, 7:23 AM:
--

[~jeagles] I have rebased the patch and opened a pull request for it: 
https://github.com/apache/tez/pull/72


was (Author: srahman):
[~jeagles] ping for re-review request

> Delete intermediate attempt data for failed attempts for Shuffle Handler
> 
>
> Key: TEZ-4129
> URL: https://issues.apache.org/jira/browse/TEZ-4129
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Turner Eagles
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Attachments: TEZ-4129.01.patch, TEZ-4129.02.patch, TEZ-4129.03.patch, 
> TEZ-4129.04.patch, TEZ-4129.05.patch, TEZ-4129.06.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>






[jira] [Updated] (TEZ-4129) Delete intermediate attempt data for failed attempts for Shuffle Handler

2020-07-23 Thread Syed Shameerur Rahman (Jira)


 [ 
https://issues.apache.org/jira/browse/TEZ-4129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Syed Shameerur Rahman updated TEZ-4129:
---
Labels: pull-request-available  (was: ShuffleHandler)

> Delete intermediate attempt data for failed attempts for Shuffle Handler
> 
>
> Key: TEZ-4129
> URL: https://issues.apache.org/jira/browse/TEZ-4129
> Project: Apache Tez
> Issue Type: Sub-task
> Reporter: Jonathan Turner Eagles
> Assignee: Syed Shameerur Rahman
> Priority: Major
> Labels: pull-request-available
> Attachments: TEZ-4129.01.patch, TEZ-4129.02.patch, TEZ-4129.03.patch, 
> TEZ-4129.04.patch, TEZ-4129.05.patch, TEZ-4129.06.patch
>
> Time Spent: 20m
> Remaining Estimate: 0h
>



