[jira] [Updated] (TEZ-1942) Number of tasks show in Tez UI with auto-reduce parallelism is misleading

2015-01-15 Thread Prakash Ramachandran (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Ramachandran updated TEZ-1942:
--
Priority: Blocker  (was: Major)

 Number of tasks show in Tez UI with auto-reduce parallelism is misleading
 -

 Key: TEZ-1942
 URL: https://issues.apache.org/jira/browse/TEZ-1942
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.2
Reporter: Rajesh Balamohan
Assignee: Prakash Ramachandran
Priority: Blocker
 Attachments: Screen Shot 2015-01-14 at 9.18.21 AM.png, Screen Shot 
 2015-01-14 at 9.18.54 AM.png, TEZ-1942.1.patch, TEZ-1942.2.patch, 
 output.json, result_with_direct_vertex.png, result_with_primary_filter.png


 Ran a simple Hive query (with Tez) and --hiveconf 
 hive.tez.auto.reducer.parallelism=true. This internally turns on Tez's 
 auto-reduce parallelism.
 - Job started off with 1009 reduce tasks
 - Tez reduces the number of reducers to 253
 - Job completes successfully, but the Tez UI shows 1009 as the number of reducers 
 (and 253 tasks as successful tasks). This can be a little misleading.
 I will attach the screenshots soon.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-1934) TestAMRecovery may fail due to the execution order is not determined

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14278908#comment-14278908
 ] 

Hitesh Shah commented on TEZ-1934:
--

Not sure why one build worked fine and the latter didn't. Triggered test patch 
again. 

 TestAMRecovery may fail due to the execution order is not determined 
 -

 Key: TEZ-1934
 URL: https://issues.apache.org/jira/browse/TEZ-1934
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-1934-1.patch


 task_1 is not guaranteed to be scheduled before task_0, so task_1 may 
 finish before task_0. In the current TestAMRecovery, however, the finish of 
 task_1 is treated as the finish signal of the vertex (only 2 tasks in this 
 vertex).
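
As a rough illustration of the ordering problem described above (not the actual test code), the sketch below waits for every task rather than treating one particular task's completion as the signal that the vertex is done. The class name CompletionOrderSketch and the method runAndAwaitAll are hypothetical and exist only for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; this is not code from TestAMRecovery.
class CompletionOrderSketch {

  // Starts `taskCount` threads and returns once all of them have finished,
  // whatever order they happen to complete in.
  static List<String> runAndAwaitAll(int taskCount) {
    List<String> finished = new CopyOnWriteArrayList<>();
    CountDownLatch allDone = new CountDownLatch(taskCount);
    for (int i = 0; i < taskCount; i++) {
      String name = "task_" + i;
      new Thread(() -> {
        finished.add(name);   // records completion order, which may vary
        allDone.countDown();
      }).start();
    }
    try {
      // Wait on the count of finished tasks, not on a specific task.
      if (!allDone.await(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return finished;
  }

  public static void main(String[] args) {
    System.out.println("finished (completion order may vary): " + runAndAwaitAll(2));
  }
}
```

The returned list may be in either order from run to run, which is the point: a check that depends on one task finishing last is not reliable.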





[jira] [Updated] (TEZ-1951) Fix general findbugs warnings in tez-dag

2015-01-15 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1951:
-
Attachment: TEZ-1951.2.patch

Addressed comments. Remove accidental indent edits in DAGAppMaster.

 Fix general findbugs warnings in tez-dag
 

 Key: TEZ-1951
 URL: https://issues.apache.org/jira/browse/TEZ-1951
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-1951.1.patch, TEZ-1951.2.patch








[jira] [Assigned] (TEZ-1941) Memory provided by *Context.getAvailableMemory needs to be setup explicitly

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth reassigned TEZ-1941:
---

Assignee: Siddharth Seth

 Memory provided by *Context.getAvailableMemory needs to be setup explicitly
 ---

 Key: TEZ-1941
 URL: https://issues.apache.org/jira/browse/TEZ-1941
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth

 *Context.getAvailableMemory relies on Runtime.maxMemory(). This doesn't 
 work for memory scaling if multiple tasks are running within a JVM.
 Container sizes (sent over RPC) can be used for setting up this value.
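
A minimal sketch of the difference, assuming a launcher that knows the container size and the number of tasks sharing the JVM. TaskMemorySketch and both of its methods are hypothetical names, and splitting the container size evenly among tasks is an assumption made for this example, not something the issue specifies.

```java
// Hypothetical illustration only; these are not Tez classes or methods.
class TaskMemorySketch {

  // What a context reports if it asks the JVM directly: one heap-wide figure,
  // identical for every task running inside this JVM.
  static long jvmWideMemory() {
    return Runtime.getRuntime().maxMemory();
  }

  // What a context could report if the value is set up explicitly, here from a
  // container size split evenly among concurrent tasks (an assumed policy).
  static long explicitPerTaskMemory(long containerBytes, int concurrentTasks) {
    if (concurrentTasks <= 0) {
      throw new IllegalArgumentException("concurrentTasks must be positive");
    }
    return containerBytes / concurrentTasks;
  }

  public static void main(String[] args) {
    long container = 1024L * 1024 * 1024; // 1 GiB, an arbitrary example size
    System.out.println("JVM-wide max memory: " + jvmWideMemory());
    System.out.println("Per-task with 4 tasks: " + explicitPerTaskMemory(container, 4));
  }
}
```

With four tasks in one JVM, each would see the full heap through the first method, so any scaling based on it would overcommit memory.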





[jira] [Updated] (TEZ-1966) TaskReporter and ContainerReporter should handle interrupts better

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1966:

Issue Type: Sub-task  (was: Bug)
Parent: TEZ-1876

 TaskReporter and ContainerReporter should handle interrupts better
 --

 Key: TEZ-1966
 URL: https://issues.apache.org/jira/browse/TEZ-1966
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth

 It looks like the interrupts will only be handled if these two are in the 
 process of sleeping.





[jira] [Created] (TEZ-1965) LocalContainerLauncher should interrupt running tasks rather than using Future.cancel

2015-01-15 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-1965:
---

 Summary: LocalContainerLauncher should interrupt running tasks 
rather than using Future.cancel
 Key: TEZ-1965
 URL: https://issues.apache.org/jira/browse/TEZ-1965
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth


Future.cancel causes future.get() to throw a CancellationException - which 
effectively ends up ignoring any exceptions which may have been thrown by just 
interrupting the task. That causes the NPE from TEZ-1962 to get lost.
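
The behaviour described above can be reproduced with plain java.util.concurrent, outside Tez. In this sketch the class name CancelVersusInterruptSketch, the method outcome, and the exception message are all made up for illustration; only the Future, ExecutorService, and Thread calls are standard JDK API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; not code from LocalContainerLauncher.
class CancelVersusInterruptSketch {

  // Stops a blocked task either with Future.cancel(true) or by interrupting
  // its thread directly, then reports what Future.get() surfaces.
  static String outcome(boolean useCancel) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    AtomicReference<Thread> runner = new AtomicReference<>();
    Callable<Void> task = () -> {
      runner.set(Thread.currentThread());
      started.countDown();
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        // Stand-in for a real failure raised while the task is stopping.
        throw new IllegalStateException("task failure seen after interrupt");
      }
      return null;
    };
    try {
      Future<Void> future = pool.submit(task);
      started.await(); // make sure the task thread is known before stopping it
      if (useCancel) {
        future.cancel(true);
      } else {
        runner.get().interrupt();
      }
      future.get(10, TimeUnit.SECONDS);
      return "completed";
    } catch (CancellationException e) {
      return "CancellationException";            // the task's own exception is lost
    } catch (ExecutionException e) {
      return "ExecutionException: " + e.getCause().getMessage();
    } catch (InterruptedException | TimeoutException e) {
      return "unexpected: " + e;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    System.out.println("cancel(true):     " + outcome(true));
    System.out.println("direct interrupt: " + outcome(false));
  }
}
```

With cancel(true), get() reports only the cancellation; with a direct interrupt, the exception thrown by the task arrives as the cause of an ExecutionException.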





[jira] [Updated] (TEZ-1965) LocalContainerLauncher should interrupt running tasks rather than using Future.cancel

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1965:

Issue Type: Sub-task  (was: Improvement)
Parent: TEZ-1876

 LocalContainerLauncher should interrupt running tasks rather than using 
 Future.cancel
 -

 Key: TEZ-1965
 URL: https://issues.apache.org/jira/browse/TEZ-1965
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth

 Future.cancel causes future.get() to throw a CancellationException - 
 which effectively ends up ignoring any exceptions which may have been thrown 
 by just interrupting the task. That causes the NPE from TEZ-1962 to get lost.





[jira] [Updated] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-01-15 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-1421:
-
Target Version/s: 0.6.1  (was: 0.6.0)

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}





[jira] [Created] (TEZ-1966) TaskReporter and ContainerReporter should handle interrupts better

2015-01-15 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-1966:
---

 Summary: TaskReporter and ContainerReporter should handle 
interrupts better
 Key: TEZ-1966
 URL: https://issues.apache.org/jira/browse/TEZ-1966
 Project: Apache Tez
  Issue Type: Bug
Reporter: Siddharth Seth
Assignee: Siddharth Seth


It looks like the interrupts will only be handled if these two are in the 
process of sleeping.
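
One common way to make such a loop notice an interrupt that arrives while it is busy, not only while it sleeps, is to check the thread's interrupt flag on every pass. The sketch below is a generic Java pattern under that assumption; InterruptAwareLoopSketch, startReporter, and stopsWithin are hypothetical names and do not reflect the actual TaskReporter or ContainerReporter code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the Tez reporter implementation.
class InterruptAwareLoopSketch {

  static Thread startReporter(AtomicLong iterations) {
    Thread t = new Thread(() -> {
      // Checking the flag each pass lets the loop stop even if the interrupt
      // arrives while it is doing work rather than sleeping.
      while (!Thread.currentThread().isInterrupted()) {
        iterations.incrementAndGet(); // stand-in for heartbeat work
        try {
          Thread.sleep(1);
        } catch (InterruptedException e) {
          // sleep() clears the flag when it throws; restore it so the loop
          // condition still sees the interrupt.
          Thread.currentThread().interrupt();
        }
      }
    }, "reporter-sketch");
    t.start();
    return t;
  }

  // Starts the loop, interrupts it, and reports whether it exited in time.
  static boolean stopsWithin(long millis) {
    Thread reporter = startReporter(new AtomicLong());
    try {
      Thread.sleep(20);
      reporter.interrupt();
      reporter.join(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return !reporter.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("reporter stopped: " + stopsWithin(5000));
  }
}
```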





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279301#comment-14279301
 ] 

Hitesh Shah commented on TEZ-1962:
--

Please run test patch locally and update jira with results, as git is down 
(which causes the precommit build to fail).

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279299#comment-14279299
 ] 

Hitesh Shah commented on TEZ-1962:
--

+1




 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279391#comment-14279391
 ] 

Siddharth Seth commented on TEZ-1962:
-

There appear to be 9 release audit warnings after applying the patch.




{code}
{color:red}-1 overall{color}.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version ) warnings.

{color:red}-1 release audit{color}.  The applied patch generated 9 
release audit warnings.
{code}

I don't think the 9 rat warnings have anything to do with the patch. From a 
local rat test
{code}
Unapproved licenses:

  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/maven-status/maven-compiler-plugin/compile/default-compile/createdFiles.lst
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/maven-status/maven-compiler-plugin/compile/default-compile/inputFiles.lst
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/maven-status/maven-compiler-plugin/testCompile/default-testCompile/createdFiles.lst
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/maven-status/maven-compiler-plugin/testCompile/default-testCompile/inputFiles.lst
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/surefire-reports/org.apache.tez.dag.history.logging.ats.TestATSHistoryLoggingService-output.txt
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/surefire-reports/org.apache.tez.dag.history.logging.ats.TestATSHistoryLoggingService.txt
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/surefire-reports/org.apache.tez.dag.history.logging.ats.TestHistoryEventTimelineConversion.txt
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/surefire-reports/TEST-org.apache.tez.dag.history.logging.ats.TestATSHistoryLoggingService.xml
  /Users/sseth/work2/projects/tez/commit/incubator-tez/tez-plugins/tez-yarn-timeline-history/target/surefire-reports/TEST-org.apache.tez.dag.history.logging.ats.TestHistoryEventTimelineConversion.xml
{code}

Thanks for the review. Will commit once git is back up.

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1941) Memory provided by *Context.getAvailableMemory needs to be setup explicitly

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279419#comment-14279419
 ] 

Hitesh Shah commented on TEZ-1941:
--

Any reason why the memory getter was not added to the execution context object 
itself? 

 Memory provided by *Context.getAvailableMemory needs to be setup explicitly
 ---

 Key: TEZ-1941
 URL: https://issues.apache.org/jira/browse/TEZ-1941
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1941.1.txt


 *Context.getAvailableMemory relies on Runtime.maxMemory(). This doesn't 
 work for memory scaling if multiple tasks are running within a JVM.
 Container sizes (sent over RPC) can be used for setting up this value.





[jira] [Commented] (TEZ-1951) Fix general findbugs warnings in tez-dag

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279521#comment-14279521
 ] 

Hadoop QA commented on TEZ-1951:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692578/TEZ-1951.2.patch
  against master revision b723a05.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 68 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/38//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/38//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/38//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/38//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/38//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/38//console

This message is automatically generated.

 Fix general findbugs warnings in tez-dag
 

 Key: TEZ-1951
 URL: https://issues.apache.org/jira/browse/TEZ-1951
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-1951.1.patch, TEZ-1951.2.patch








[jira] [Updated] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1962:

Attachment: TEZ-1962.1.branch_0.6.txt

Rebased patch for branch-0.6 and branch-0.5.

Committed to master. Running tests locally for this branch, and will commit 
after that.

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.branch_0.6.txt, TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Updated] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1879:

Attachment: TEZ-1879.2.txt

Updated patch. Applies on top of TEZ-1962.

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279517#comment-14279517
 ] 

Hadoop QA commented on TEZ-1962:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692579/TEZ-1962.1.txt
  against master revision b723a05.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 254 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/37//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/37//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/37//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/37//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/37//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/37//console

This message is automatically generated.

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Created] (TEZ-1968) Tez UI - All vertices of DAG are not listed in vertices page

2015-01-15 Thread Rajesh Balamohan (JIRA)
Rajesh Balamohan created TEZ-1968:
-

 Summary: Tez UI - All vertices of DAG are not listed in vertices 
page
 Key: TEZ-1968
 URL: https://issues.apache.org/jira/browse/TEZ-1968
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan


DAG view and vertices view can be a little misleading at times.  I can see that 
Reducer 9 is part of the DAG and is listed in DAG view.  However, it is not 
getting listed in Vertices page.  I will attach the screen shots soon.





[jira] [Updated] (TEZ-1968) Tez UI - All vertices of DAG are not listed in vertices page

2015-01-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1968:
--
Attachment: Screen Shot 2015-01-16 at 5.11.27 AM.png
Screen Shot 2015-01-16 at 5.11.14 AM.png

Notice that Reducer 9 is available in DAG view, but not in vertices view.

 Tez UI - All vertices of DAG are not listed in vertices page
 

 Key: TEZ-1968
 URL: https://issues.apache.org/jira/browse/TEZ-1968
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: Screen Shot 2015-01-16 at 5.11.14 AM.png, Screen Shot 
 2015-01-16 at 5.11.27 AM.png


 DAG view and vertices view can be a little misleading at times.  I can see 
 that Reducer 9 is part of the DAG and is listed in DAG view.  However, it 
 is not getting listed in Vertices page.  I will attach the screen shots soon.





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279589#comment-14279589
 ] 

Hitesh Shah commented on TEZ-1879:
--

Clarification: 

With the change, there is now: 

{code}
+this.appMasterUgi = UserGroupInformation
+.createRemoteUser(jobUserName);
{code}

Earlier, in local mode: 
appMasterUgi = UserGroupInformation.getCurrentUser();

Will this create any issues in local mode? 

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Assigned] (TEZ-1937) Reduce cost of merging ifiles in UnorderedPartitionedWriter

2015-01-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan reassigned TEZ-1937:
-

Assignee: Rajesh Balamohan

 Reduce cost of merging ifiles in UnorderedPartitionedWriter
 ---

 Key: TEZ-1937
 URL: https://issues.apache.org/jira/browse/TEZ-1937
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-1937.1.patch, TEZ-1937.WIP.patch


 Currently we iterate through all spilled files for merging.  This incurs 
 additional deserialization cost.





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279455#comment-14279455
 ] 

Hitesh Shah commented on TEZ-1879:
--

Comments: 

{code}
 appMaster.appMasterUgi = UserGroupInformation
 .createRemoteUser(jobUserName);
 appMaster.appMasterUgi.addCredentials(appMaster.amTokens);
{code}
  - Can the above ugi instantiation be done in the DAGAppMaster constructor 
itself? 

{code}
-  private Credentials amTokens = new Credentials(); // Filled during init
+  private final Credentials amTokens;
{code}
  - LocalClient uses amCredentials whereas the AM itself uses amTokens - a 
bit confusing. Probably does not need a change right now but might be helpful 
for readability.

Mostly looks good. 


 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Updated] (TEZ-1941) Memory provided by *Context.getAvailableMemory needs to be setup explicitly

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1941:

Attachment: TEZ-1941.1.txt

Simple patch which moves the memory specifications over to main methods - 
instead of accessing Runtime.maxmemory from within context instances. Also 
fixes some findbugs-warnings in runtime-internals introduced by moving TezChild 
from tez-dag.
Applies on top of TEZ-1879.

[~hitesh] - please review.

 Memory provided by *Context.getAvailableMemory needs to be setup explicitly
 ---

 Key: TEZ-1941
 URL: https://issues.apache.org/jira/browse/TEZ-1941
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1941.1.txt


 *Context.getAvailableMemory relies on Runtime.maxMemory(). This doesn't 
 work for memory scaling if multiple tasks are running within a JVM.
 Container sizes (sent over RPC) can be used for setting up this value.





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279424#comment-14279424
 ] 

Hadoop QA commented on TEZ-1879:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692617/TEZ-1879.2.txt
  against master revision b723a05.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/36//console

This message is automatically generated.

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1941) Memory provided by *Context.getAvailableMemory needs to be setup explicitly

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279425#comment-14279425
 ] 

Hadoop QA commented on TEZ-1941:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692618/TEZ-1941.1.txt
  against master revision b723a05.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/35//console

This message is automatically generated.

 Memory provided by *Context.getAvailableMemory needs to be setup explicitly
 ---

 Key: TEZ-1941
 URL: https://issues.apache.org/jira/browse/TEZ-1941
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1941.1.txt


 *Context.getAvailableMemory relies on Runtime.maxMemory(). This doesn't 
 work for memory scaling if multiple tasks are running within a JVM.
 Container sizes (sent over RPC) can be used for setting up this value.





[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279497#comment-14279497
 ] 

Siddharth Seth commented on TEZ-1661:
-

[~zjffdu] - the patch is required, however I don't think this thread blocks JVM 
shutdown since it's a daemon. Is there a way to reproduce this?
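
For context on the daemon remark, here is a small standalone sketch of a daemon thread blocked on a queue take() that exits once it is interrupted. It uses only JDK classes; DaemonTakeSketch, startHandler, and stopsOnInterrupt are hypothetical names, and the loop is an assumed shape rather than the actual LocalTaskSchedulerService handler.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical illustration only; not the LocalTaskSchedulerService code.
class DaemonTakeSketch {

  static Thread startHandler(BlockingQueue<String> queue) {
    Thread t = new Thread(() -> {
      try {
        while (true) {
          queue.take(); // blocks until a request arrives or the thread is interrupted
        }
      } catch (InterruptedException e) {
        // Leave the loop instead of retrying take(), so the thread can end.
      }
    }, "handler-sketch");
    t.setDaemon(true); // a daemon thread does not keep the JVM alive on its own
    t.start();
    return t;
  }

  // Interrupts the handler and reports whether it terminated.
  static boolean stopsOnInterrupt() {
    Thread handler = startHandler(new PriorityBlockingQueue<>());
    handler.interrupt();
    try {
      handler.join(5000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return !handler.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("handler stopped after interrupt: " + stopsOnInterrupt());
  }
}
```

A handler that caught the InterruptedException inside the loop and called take() again would keep running after the interrupt, even though being a daemon would still let the JVM exit.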

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch


 LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
 TezClient shuts down (e.g., TezClient.stop).
 Below is jstack output observed when running in Tez local mode:
 {code}
 Thread-53 prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable [0x00011df9]
 java.lang.Thread.State: RUNNABLE
 at java.lang.Throwable.fillInStackTrace(Native Method)
 at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
 - locked 0x0007b6ce60a0 (a java.lang.InterruptedException)
 at java.lang.Throwable.<init>(Throwable.java:250)
 at java.lang.Exception.<init>(Exception.java:54)
 at java.lang.InterruptedException.<init>(InterruptedException.java:57)
 at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
 at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
 at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
 at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
 at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
 at java.lang.Thread.run(Thread.java:745)
 {code}





[jira] [Comment Edited] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279404#comment-14279404
 ] 

Hitesh Shah edited comment on TEZ-1962 at 1/15/15 10:29 PM:


I think you need to run mvn clean -Phadoop24 -P\!hadoop26 :) - not sure why 
target is not getting excluded though. 


was (Author: hitesh):
I think you need to run mvn clean -Phadoop24 -P\!hadoop26 :)

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279404#comment-14279404
 ] 

Hitesh Shah commented on TEZ-1962:
--

I think you need to run mvn clean -Phadoop24 -P\!hadoop26 :)

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1951) Fix general findbugs warnings in tez-dag

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279525#comment-14279525
 ] 

Hitesh Shah commented on TEZ-1951:
--

Committing shortly.

 Fix general findbugs warnings in tez-dag
 

 Key: TEZ-1951
 URL: https://issues.apache.org/jira/browse/TEZ-1951
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-1951.1.patch, TEZ-1951.2.patch








[jira] [Updated] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1879:

Attachment: TEZ-1879.3.txt

Thanks for the review. Updated patch.

bq. Can the above ugi instantiation be done in the DAGAppMaster constructor 
itself?
Fixed.

bq. LocalClient uses amCredentials whereas the AM itself uses amTokens - a bit 
confusing. Probably does not need a change right now but might be helpful for 
readability.
Will rename amTokens to amCredentials just before commit.

[~hitesh] - do you want to look at this again, or should I commit it?

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279588#comment-14279588
 ] 

Hadoop QA commented on TEZ-1962:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  
http://issues.apache.org/jira/secure/attachment/12692641/TEZ-1962.1.branch_0.6.txt
  against master revision 2544b05.

{color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/40//console

This message is automatically generated.

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.branch_0.6.txt, TEZ-1962.1.txt, stack5.txt


 I've been trying to port the hive ut to tez local mode. However, local mode 
 seems to leak threads which causes tests to crash after a while (oom). See 
 attached stack trace - there are a lot of TezChild threads still hanging 
 around.
 ([~sseth] as discussed offline)





[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279593#comment-14279593
 ] 

Jeff Zhang commented on TEZ-1661:
-

[~sseth] It is not a daemon thread. I also verified this through jstack.
{code}
Thread-33 prio=5 tid=0x7fb553266800 nid=0x6307 runnable 
[0x0001153e2000]
   java.lang.Thread.State: RUNNABLE
at java.lang.Throwable.fillInStackTrace(Native Method)
at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
- locked 0x0007b05c6b40 (a java.lang.InterruptedException)
at java.lang.Throwable.<init>(Throwable.java:250)
at java.lang.Exception.<init>(Exception.java:54)
at java.lang.InterruptedException.<init>(InterruptedException.java:57)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
at 
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
at 
java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
at 
org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:322)
at 
org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:316)
at java.lang.Thread.run(Thread.java:745)
{code}

bq. Is there a way to reproduce this ?
Adding the following to TezExampleBase.createTezClient and removing the 
System.exit call from WordCount.java reproduces it.
{code}
tezConf.setBoolean(TezConfiguration.TEZ_LOCAL_MODE, true);
tezConf.set("fs.defaultFS", "file:///");
tezConf.setBoolean(
TezRuntimeConfiguration.TEZ_RUNTIME_OPTIMIZE_LOCAL_FETCH, true);
{code}

Attaching a new patch that changes the thread to a daemon.
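
As a rough illustration of why the daemon flag matters here (a standalone sketch, not the code from the patch; the class and method names are hypothetical), a thread blocked in PriorityBlockingQueue.take() keeps the JVM alive unless it is marked as a daemon or is interrupted and allowed to exit:

```java
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch: not the LocalTaskSchedulerService code, only the same pattern.
final class DaemonTakeSketch {

    // Starts a handler thread that blocks on take() until it is interrupted.
    static Thread startHandler(PriorityBlockingQueue<Integer> queue, boolean daemon) {
        Thread handler = new Thread(() -> {
            try {
                while (true) {
                    queue.take(); // blocks while the queue is empty
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and leave the loop so the thread can end.
                Thread.currentThread().interrupt();
            }
        }, "request-handler");
        handler.setDaemon(daemon); // must be called before start()
        handler.start();
        return handler;
    }

    public static void main(String[] args) throws InterruptedException {
        PriorityBlockingQueue<Integer> queue = new PriorityBlockingQueue<>();
        Thread handler = startHandler(queue, true);
        Thread.sleep(100);
        System.out.println("daemon=" + handler.isDaemon() + " alive=" + handler.isAlive());
        // main returns here; with daemon=false the blocked handler would keep the JVM running.
    }
}
```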



 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch


 LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
 TezClient shuts down (e.g., TezClient.stop).
 Below is jstack output observed when running in Tez local mode:
 {code}
 Thread-53 prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable 
 [0x00011df9]
java.lang.Thread.State: RUNNABLE
 at java.lang.Throwable.fillInStackTrace(Native Method)
 at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
 - locked 0x0007b6ce60a0 (a java.lang.InterruptedException)
 at java.lang.Throwable.<init>(Throwable.java:250)
 at java.lang.Exception.<init>(Exception.java:54)
 at java.lang.InterruptedException.<init>(InterruptedException.java:57)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
 at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
 at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
 at java.lang.Thread.run(Thread.java:745)
 {code}





[jira] [Updated] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated TEZ-1949:
-
Attachment: TEZ-1949.2.patch

Test-case to confirm that we can set these parameters successfully.

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Critical
 Attachments: TEZ-1949.1.patch, TEZ-1949.2.patch


 Tez configuration whitelisting is missing TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH 
 for broadcast edges (UnorderedKVInput).
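
For readers unfamiliar with the term, the sketch below shows configuration whitelisting in generic form, assuming it means that only keys on an allowed list are copied into an edge's configuration. The class name and key strings are placeholders and are not the actual Tez constants or classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: key strings and class name are placeholders, not Tez code.
final class WhitelistSketch {

    // Copies only the entries whose key appears in allowedKeys.
    static Map<String, String> filter(Map<String, String> conf, Set<String> allowedKeys) {
        Map<String, String> filtered = new HashMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            if (allowedKeys.contains(entry.getKey())) {
                filtered.put(entry.getKey(), entry.getValue());
            }
        }
        return filtered;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("example.local.fetch", "true");
        conf.put("example.shared.fetch", "true");

        // A key missing from the allowed set is silently dropped.
        Set<String> allowed = new HashSet<>(Arrays.asList("example.local.fetch"));
        System.out.println(filter(conf, allowed).containsKey("example.shared.fetch"));
    }
}
```

In this model, a setting that is absent from the allowed list never reaches the consumer even when the user sets it, which matches the symptom reported here.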





[jira] [Commented] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279796#comment-14279796
 ] 

Rajesh Balamohan commented on TEZ-1949:
---

+1. lgtm.

Very minor comment.  It is possible to set this using 
setFromConfiguration(Configuration conf). If needed, you can add it to the testcase 
(i.e., adding TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH to Configuration and invoking 
setFromConfiguration() in builder).  

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Critical
 Attachments: TEZ-1949.1.patch, TEZ-1949.2.patch


 Tez configuration whitelisting is missing TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH 
 for broadcast edges (UnorderedKVInput).





[jira] [Commented] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279797#comment-14279797
 ] 

Hadoop QA commented on TEZ-1949:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692691/TEZ-1949.2.patch
  against master revision 880d4f3.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 68 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.runtime.library.common.shuffle.TestFetcher

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/44//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/44//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/44//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/44//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/44//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/44//console

This message is automatically generated.

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Critical
 Attachments: TEZ-1949.1.patch, TEZ-1949.2.patch


 Tez configuration whitelisting is missing TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH 
 for broadcast edges (UnorderedKVInput).





[jira] [Commented] (TEZ-557) Documentation for configuring Tez

2015-01-15 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279166#comment-14279166
 ] 

Jonathan Eagles commented on TEZ-557:
-

Bumped this long-standing issue to 0.6.1 after discussion with Hitesh.

 Documentation for configuring Tez
 -

 Key: TEZ-557
 URL: https://issues.apache.org/jira/browse/TEZ-557
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Blocker

 Add docs on how to set java opts, control log levels, etc. 





[jira] [Commented] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-01-15 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279173#comment-14279173
 ] 

Jonathan Eagles commented on TEZ-1421:
--

Thanks, [~ozawa]. I will spend some time trying to reproduce. As this is a 
long-standing issue that has existed for several releases, I will target 0.6.1, 
since it doesn't block the Tez UI from being released in 0.6.0.

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA
Priority: Blocker

 I tested MapredWordCount against 70GB generated by RandomTextWriter. When a 
 Combiner runs, it throws NPE. It looks like setCombinerClass doesn't work 
 correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}





[jira] [Created] (TEZ-1967) Add a monitoring API on DAGClient which returns after a time interval or on DAG state change

2015-01-15 Thread Siddharth Seth (JIRA)
Siddharth Seth created TEZ-1967:
---

 Summary: Add a monitoring API on DAGClient which returns after a 
time interval or on DAG state change
 Key: TEZ-1967
 URL: https://issues.apache.org/jira/browse/TEZ-1967
 Project: Apache Tez
  Issue Type: Improvement
Reporter: Siddharth Seth


To monitor a running DAG, clients end up calling DAGClient.getDAGStatus in a 
loop with a poll interval.
In the worst case, they find out about DAG completion, failure, etc. only after 
the poll interval has elapsed.

Instead, an API can be added which waits on the AM for a specified interval, 
but can return earlier if the DAG state changes.

This will end up blocking RPC handlers - but that isn't a problem since we 
don't have many entities querying for DAG status.
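
A minimal sketch of the "wait for an interval, but return early on a state change" behaviour, assuming a plain in-process monitor. The class, enum, and method names are hypothetical and do not reflect the DAGClient API or the eventual implementation.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names are illustrative and not part of DAGClient.
final class DagStateMonitorSketch {
    enum State { RUNNING, SUCCEEDED, FAILED }

    private State state = State.RUNNING;

    // Called by whoever owns the DAG state; wakes up all waiting callers.
    synchronized void setState(State newState) {
        state = newState;
        notifyAll();
    }

    // Returns the current state once it differs from 'known', or after the
    // timeout has elapsed, whichever happens first.
    synchronized State awaitChange(State known, long timeoutMillis) throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (state == known) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // timed out without a state change
            }
            // Round up so the wait is never 0 ms (which would mean "wait forever").
            wait((remainingNanos + 999_999L) / 1_000_000L);
        }
        return state;
    }

    public static void main(String[] args) throws InterruptedException {
        DagStateMonitorSketch monitor = new DagStateMonitorSketch();
        new Thread(() -> monitor.setState(State.SUCCEEDED)).start();
        State seen = monitor.awaitChange(State.RUNNING, 5_000);
        System.out.println("Observed state: " + seen);
    }
}
```

The loop around wait() guards against spurious wakeups, and the caller blocks for at most the requested interval, which mirrors the trade-off of holding an RPC handler while waiting.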





[jira] [Commented] (TEZ-1951) Fix general findbugs warnings in tez-dag

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279276#comment-14279276
 ] 

Siddharth Seth commented on TEZ-1951:
-

+1. Looks good.

 Fix general findbugs warnings in tez-dag
 

 Key: TEZ-1951
 URL: https://issues.apache.org/jira/browse/TEZ-1951
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-1951.1.patch, TEZ-1951.2.patch








[jira] [Commented] (TEZ-1934) TestAMRecovery may fail due to the execution order is not determined

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279328#comment-14279328
 ] 

Hitesh Shah commented on TEZ-1934:
--

+1. Looks good to commit (after git comes back online).

 TestAMRecovery may fail due to the execution order is not determined 
 -

 Key: TEZ-1934
 URL: https://issues.apache.org/jira/browse/TEZ-1934
 Project: Apache Tez
  Issue Type: Bug
Reporter: Jeff Zhang
Assignee: Jeff Zhang
 Attachments: TEZ-1934-1.patch


 The scheduling order of task_0 and task_1 is not guaranteed, so task_1 may 
 finish before task_0. However, the current TestAMRecovery treats the finish of 
 task_1 as the finish signal of the vertex (there are only 2 tasks in this 
 vertex).
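
One generic way to avoid depending on completion order is to wait for every task rather than for a specific one. The sketch below uses a CountDownLatch for that; it is an illustration only (hypothetical names), not the change made in the attached patch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: illustrates order-independent completion, not the test's code.
final class VertexFinishSketch {

    // Starts taskCount threads and reports whether all of them finished in time.
    static boolean awaitAllTasks(int taskCount, long timeoutMillis) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(taskCount);
        // Start in reverse order on purpose: the result must not depend on ordering.
        for (int i = taskCount - 1; i >= 0; i--) {
            new Thread(done::countDown, "task_" + i).start();
        }
        return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("all tasks finished: " + awaitAllTasks(2, 5_000));
    }
}
```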





[jira] [Updated] (TEZ-1937) Reduce cost of merging ifiles in UnorderedPartitionedWriter

2015-01-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1937:
--
Attachment: TEZ-1937.1.patch

[~sseth] - Can you please have a look when you find time?

 Reduce cost of merging ifiles in UnorderedPartitionedWriter
 ---

 Key: TEZ-1937
 URL: https://issues.apache.org/jira/browse/TEZ-1937
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: TEZ-1937.1.patch, TEZ-1937.WIP.patch


 Currently we iterate through all spilled files for merging.  This incurs 
 additional deserialization cost.





[jira] [Updated] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-1661:

Attachment: TEZ-1661-2.patch

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch, TEZ-1661-2.patch


 LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
 TezClient shuts down (e.g., TezClient.stop).
 Below is jstack output observed when running in Tez local mode:
 {code}
 Thread-53 prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable 
 [0x00011df9]
java.lang.Thread.State: RUNNABLE
 at java.lang.Throwable.fillInStackTrace(Native Method)
 at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
 - locked 0x0007b6ce60a0 (a java.lang.InterruptedException)
 at java.lang.Throwable.<init>(Throwable.java:250)
 at java.lang.Exception.<init>(Exception.java:54)
 at java.lang.InterruptedException.<init>(InterruptedException.java:57)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
 at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
 at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
 at java.lang.Thread.run(Thread.java:745)
 {code}





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279598#comment-14279598
 ] 

Siddharth Seth commented on TEZ-1879:
-

bq. Also, in cluster mode, I see this step missing: 
UserGroupInformation.setConfiguration(conf);  LocalClient was fixed to invoke 
this but nothing in the AppMaster for cluster mode.

Good catch. Revising the patch to include this.

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Updated] (TEZ-1937) Reduce cost of merging ifiles in UnorderedPartitionedWriter

2015-01-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1937:
--
Attachment: TEZ-1937.2.patch

Removed an unrelated change in the latest patch.

 Reduce cost of merging ifiles in UnorderedPartitionedWriter
 ---

 Key: TEZ-1937
 URL: https://issues.apache.org/jira/browse/TEZ-1937
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
Assignee: Rajesh Balamohan
 Attachments: TEZ-1937.1.patch, TEZ-1937.2.patch, TEZ-1937.WIP.patch


 Currently we iterate through all spilled files for merging.  This incurs 
 additional deserialization cost.





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279673#comment-14279673
 ] 

Hadoop QA commented on TEZ-1879:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692648/TEZ-1879.4.txt
  against master revision 2544b05.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 66 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/43//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/43//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/43//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/43//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/43//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/43//console

This message is automatically generated.

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt, 
 TEZ-1879.4.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279646#comment-14279646
 ] 

Hadoop QA commented on TEZ-1661:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12692646/TEZ-1661-2.patch
  against master revision 2544b05.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 68 new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/42//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/42//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/42//console

This message is automatically generated.

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch, TEZ-1661-2.patch


 LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
 TezClient shuts down (e.g., TezClient.stop).
 Below is jstack output observed when running in Tez local mode:
 {code}
 Thread-53 prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable 
 [0x00011df9]
java.lang.Thread.State: RUNNABLE
 at java.lang.Throwable.fillInStackTrace(Native Method)
 at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
 - locked 0x0007b6ce60a0 (a java.lang.InterruptedException)
 at java.lang.Throwable.<init>(Throwable.java:250)
 at java.lang.Exception.<init>(Exception.java:54)
 at java.lang.InterruptedException.<init>(InterruptedException.java:57)
 at 
 java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
 at 
 java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
 at 
 java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
 at 
 org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
 at java.lang.Thread.run(Thread.java:745)
 {code}





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279710#comment-14279710
 ] 

Siddharth Seth commented on TEZ-1879:
-

[~hitesh] - do you want to take another look?

 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt, 
 TEZ-1879.4.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1879) Create local UGI instances for each task and the AM, when running in LocalMode

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279595#comment-14279595
 ] 

Siddharth Seth commented on TEZ-1879:
-

No, it will not create any issues. The intent is to not use
Ugi.getCurrentUser - since that leads to all kinds of issues related to
sharing credentials via the UGI.




 Create local UGI instances for each task and the AM, when running in LocalMode
 --

 Key: TEZ-1879
 URL: https://issues.apache.org/jira/browse/TEZ-1879
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Siddharth Seth
Assignee: Siddharth Seth
 Attachments: TEZ-1879.1.txt, TEZ-1879.2.txt, TEZ-1879.3.txt


 Modifying the client UGI can cause issues when the client tries to submit 
 another job - or has tokens already populated in the UGI.





[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279637#comment-14279637
 ] 

Siddharth Seth commented on TEZ-1661:
-

I can't reproduce this locally, but the patch looks good. +1. 

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch, TEZ-1661-2.patch


 LocalTaskScheduler hangs on 'take' from the 'taskRequestQueue ' when 
 TezClient shuts down (e.g., TezClient.stop).
 Below is jstack output observed when running in Tez local mode:
 {code}
 Thread-53 prio=5 tid=0x7fc876d8f800 nid=0xac07 runnable [0x00011df9]
    java.lang.Thread.State: RUNNABLE
     at java.lang.Throwable.fillInStackTrace(Native Method)
     at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
     - locked 0x0007b6ce60a0 (a java.lang.InterruptedException)
     at java.lang.Throwable.<init>(Throwable.java:250)
     at java.lang.Exception.<init>(Exception.java:54)
     at java.lang.InterruptedException.<init>(InterruptedException.java:57)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
     at java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
     at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:535)
     at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.processRequest(LocalTaskSchedulerService.java:310)
     at org.apache.tez.dag.app.rm.LocalTaskSchedulerService$AsyncDelegateRequestHandler.run(LocalTaskSchedulerService.java:304)
     at java.lang.Thread.run(Thread.java:745)
 {code}





[jira] [Updated] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-1661:

Target Version/s: 0.7.0

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch, TEZ-1661-2.patch




[jira] [Updated] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1949:
-
Target Version/s: 0.7.0, 0.5.4, 0.6.1  (was: 0.7.0, 0.6.1)

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: TEZ-1949.1.patch


 Tez configuration whitelisting is missing TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH 
 for broadcast edges (UnorderedKVInput).





[jira] [Updated] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Hitesh Shah (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hitesh Shah updated TEZ-1949:
-
Priority: Critical  (was: Major)

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
Priority: Critical
 Attachments: TEZ-1949.1.patch




[jira] [Updated] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang updated TEZ-1661:

Attachment: TEZ-1661-1.patch

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch




[jira] [Assigned] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Zhang reassigned TEZ-1661:
---

Assignee: Jeff Zhang

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang



[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278574#comment-14278574
 ] 

Jeff Zhang commented on TEZ-1661:
-

asyncDelegateRequestThread in LocalTaskSchedulerService is not stopped when 
DAGAppMaster is shut down in local mode. (This also happens in non-local 
mode, but there we call System.exit when shutting down the Tez AM, so it 
does not hang.) The tez-examples don't hang in local mode because we always 
call System.exit when the job is done, as follows. But it doesn't make sense 
to require users to always do that. Attaching a patch to address this issue. 
[~sseth], [~jeagles] please help review. 
{code}
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);
{code}
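
A minimal sketch of the hang described above, using only the JDK and not taken from the attached patch: a non-daemon thread parked in PriorityBlockingQueue.take() keeps the JVM alive until something interrupts it. The class and method names (RequestHandlerSketch, start, stop, isRunning) are hypothetical and are not Tez APIs.

```java
import java.util.concurrent.PriorityBlockingQueue;

public class RequestHandlerSketch {
    // Hypothetical stand-in for the scheduler's request queue.
    private final PriorityBlockingQueue<Integer> queue = new PriorityBlockingQueue<>();
    private Thread worker;

    public void start() {
        worker = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    // Blocks until a request arrives or the thread is interrupted.
                    queue.take();
                }
            } catch (InterruptedException e) {
                // Interrupted during shutdown: restore the flag and let the thread end.
                Thread.currentThread().interrupt();
            }
        }, "request-handler-sketch");
        worker.start();
    }

    // Without this interrupt the worker stays parked in take(), and the JVM
    // cannot exit unless the caller invokes System.exit.
    public boolean stop(long timeoutMillis) {
        worker.interrupt();
        try {
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public boolean isRunning() {
        return worker != null && worker.isAlive();
    }

    public static void main(String[] args) {
        RequestHandlerSketch handler = new RequestHandlerSketch();
        handler.start();
        System.out.println("stopped cleanly: " + handler.stop(1000));
    }
}
```

With an explicit stop hook like this, a caller that shuts the service down no longer needs System.exit to get rid of the blocked thread.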

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch




[jira] [Commented] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278503#comment-14278503
 ] 

Hadoop QA commented on TEZ-1661:


{color:red}-1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12692481/TEZ-1661-1.patch
  against master revision 61bb0f8.

{color:green}+1 @author{color}.  The patch does not contain any @author tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 260 new Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/31//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-mapreduce.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-internals.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-tests.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/31//artifact/patchprocess/newPatchFindbugsWarningstez-examples.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/31//console

This message is automatically generated.

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch




[jira] [Commented] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-01-15 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278600#comment-14278600
 ] 

Tsuyoshi OZAWA commented on TEZ-1421:
-

Build Step: I'm using Hadoop 2.6.0 and Tez trunk.

{code}
# Tez
mvn package -Dhadoop.version=2.6.0
{code}

tez-site.xml is available here: https://gist.github.com/oza/88fb9449a1fdf83cfd15

I'll relaunch jobs with INFO-level logging. Please wait a moment.

 MRCombiner throws NPE in MapredWordCount on master branch
 -

 Key: TEZ-1421
 URL: https://issues.apache.org/jira/browse/TEZ-1421
 Project: Apache Tez
  Issue Type: Bug
Reporter: Tsuyoshi OZAWA
Priority: Blocker

 I tested MapredWordCount against 70GB of data generated by RandomTextWriter. 
 When a Combiner runs, it throws an NPE. It looks like setCombinerClass 
 doesn't work correctly.
 {quote}
 Caused by: java.lang.RuntimeException: java.lang.NullPointerException
 at 
 org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
 at 
 org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:122)
 at org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:112)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager.runCombineProcessor(MergeManager.java:472)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeManager$InMemoryMerger.merge(MergeManager.java:605)
 at 
 org.apache.tez.runtime.library.common.shuffle.impl.MergeThread.run(MergeThread.java:89)
 {quote}
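
The fuller trace posted in a comment on this issue bottoms out in ConcurrentHashMap.get, called from ReflectionUtils.newInstance. ConcurrentHashMap rejects null keys, so that shape is consistent with a null class reaching the lookup, for example because no combiner class was resolved from the configuration. That reading is an assumption, not a confirmed diagnosis. The JDK-only sketch below shows just the null-key behaviour; NullKeyLookupSketch and its cache are hypothetical and are not Hadoop or Tez code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyLookupSketch {
    // Hypothetical cache keyed by class, loosely modelled on a constructor cache.
    private static final Map<Class<?>, String> CACHE = new ConcurrentHashMap<>();

    // Returns true if looking up the given key throws a NullPointerException.
    static boolean lookupThrowsNpe(Class<?> key) {
        try {
            CACHE.get(key);
            return false;
        } catch (NullPointerException e) {
            // ConcurrentHashMap does not permit null keys, even for get().
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("non-null key throws: " + lookupThrowsNpe(String.class));
        System.out.println("null key throws: " + lookupThrowsNpe(null));
    }
}
```

If the assumption holds, checking that the combiner class is non-null before instantiating it would turn the bare NPE into a clearer configuration error.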





[jira] [Commented] (TEZ-1421) MRCombiner throws NPE in MapredWordCount on master branch

2015-01-15 Thread Tsuyoshi OZAWA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278597#comment-14278597
 ] 

Tsuyoshi OZAWA commented on TEZ-1421:
-

Attaching a stack trace I ran into recently:

{quote}
15/01/09 08:10:41 INFO mapreduce.Job:  map 98% reduce 0%
15/01/09 08:10:42 INFO mapreduce.Job:  map 99% reduce 0%
15/01/09 08:10:44 INFO mapreduce.Job:  map 100% reduce 0%
15/01/09 08:10:47 INFO mapreduce.Job: Job job_1420765352344_0022 failed with 
state FAILED due to: Vertex failed, vertexName=finalreduce, 
vertexId=vertex_1420765352344_0022_1_01, diagnostics=[Task failed, 
taskId=task_1420765352344_0022_1_01_00, diagnostics=[TaskAttempt 0 failed, 
info=[Error: 
exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError:
 error in shuffle in MemtoDiskMerger [initialmap]
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:347)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at 
org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:127)
at 
org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:117)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.runCombineProcessor(MergeManager.java:480)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager$InMemoryMerger.merge(MergeManager.java:615)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeThread.run(MergeThread.java:89)
Caused by: java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
at 
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
... 5 more
, errorMessage=Shuffle Runner 
Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError:
 error in shuffle in MemtoDiskMerger [initialmap]
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:347)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.lang.NullPointerException
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:131)
at 
org.apache.tez.mapreduce.combine.MRCombiner.runOldCombiner(MRCombiner.java:127)
at 
org.apache.tez.mapreduce.combine.MRCombiner.combine(MRCombiner.java:117)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager.runCombineProcessor(MergeManager.java:480)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeManager$InMemoryMerger.merge(MergeManager.java:615)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.MergeThread.run(MergeThread.java:89)
Caused by: java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
at 
java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988)
at 
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:123)
... 5 more
], TaskAttempt 1 failed, info=[Error: 
exceptionThrown=org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError:
 error in shuffle in MemtoDiskMerger [initialmap]
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:347)
at 
org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.call(Shuffle.java:327)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: 

[jira] [Comment Edited] (TEZ-1661) LocalTaskScheduler hangs when shutdown

2015-01-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278574#comment-14278574
 ] 

Jeff Zhang edited comment on TEZ-1661 at 1/15/15 12:14 PM:
---

asyncDelegateRequestThread in LocalTaskSchedulerService is not stopped when 
DAGAppMaster is shut down in local mode. (This also happens in non-local 
mode, but there we call System.exit when shutting down the Tez AM, so it 
does not hang.) The tez-examples don't hang in local mode because we always 
call System.exit when the job is done, as follows. But it doesn't make sense 
to require users to always do that.
{code}
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);
{code}

Attaching a patch to address this issue. [~sseth], [~jeagles] please help 
review. 


was (Author: zjffdu):
asyncDelegateRequestThread in LocalTaskSchedulerService is not stopped when 
DAGAppMaster is shut down in local mode. (This also happens in non-local 
mode, but there we call System.exit when shutting down the Tez AM, so it 
does not hang.) The tez-examples don't hang in local mode because we always 
call System.exit when the job is done, as follows. But it doesn't make sense 
to require users to always do that. Attaching a patch to address this issue. 
[~sseth], [~jeagles] please help review. 
{code}
int res = ToolRunner.run(new Configuration(), new WordCount(), args);
System.exit(res);
{code}

 LocalTaskScheduler hangs when shutdown
 --

 Key: TEZ-1661
 URL: https://issues.apache.org/jira/browse/TEZ-1661
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.5.0
 Environment: Local Mode
Reporter: Oleg Zhurakousky
Assignee: Jeff Zhang
 Attachments: TEZ-1661-1.patch




[jira] [Updated] (TEZ-1937) Reduce cost of merging ifiles in UnorderedPartitionedWriter

2015-01-15 Thread Rajesh Balamohan (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rajesh Balamohan updated TEZ-1937:
--
Attachment: TEZ-1937.WIP.patch

Attaching a WIP patch. This should be faster than the current approach. 
However, if we can move EOF_MARKER out of the compressed block (in 
IFile.close()), then we can just copy the compressed IFile content to 
another IFile.

 Reduce cost of merging ifiles in UnorderedPartitionedWriter
 ---

 Key: TEZ-1937
 URL: https://issues.apache.org/jira/browse/TEZ-1937
 Project: Apache Tez
  Issue Type: Bug
Reporter: Rajesh Balamohan
 Attachments: TEZ-1937.WIP.patch


 Currently we iterate through all spilled files for merging.  This incurs 
 additional deserialization cost.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-1905) Fix findbugs warnings in tez-tests

2015-01-15 Thread Rajesh Balamohan (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14278628#comment-14278628
 ] 

Rajesh Balamohan commented on TEZ-1905:
---

lgtm. +1

 Fix findbugs warnings in tez-tests
 --

 Key: TEZ-1905
 URL: https://issues.apache.org/jira/browse/TEZ-1905
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Hitesh Shah
Assignee: Hitesh Shah
 Attachments: TEZ-1905.1.patch


 https://builds.apache.org/job/PreCommit-Tez-Build/8/artifact/patchprocess/newPatchFindbugsWarningstez-tests.html
  





[jira] [Updated] (TEZ-557) Documentation for configuring Tez

2015-01-15 Thread Jonathan Eagles (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-557:

Priority: Critical  (was: Blocker)

 Documentation for configuring Tez
 -

 Key: TEZ-557
 URL: https://issues.apache.org/jira/browse/TEZ-557
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Critical

 Add docs on how to set java opts, control log levels, etc. 





[jira] [Comment Edited] (TEZ-557) Documentation for configuring Tez

2015-01-15 Thread Jonathan Eagles (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279166#comment-14279166
 ] 

Jonathan Eagles edited comment on TEZ-557 at 1/15/15 7:30 PM:
--

Bumped this long-standing issue to 0.6.1 and downgraded it to Critical after 
discussion with Hitesh.


was (Author: jeagles):
Bumped this long-standing issue to 0.6.1 after discussion with Hitesh.

 Documentation for configuring Tez
 -

 Key: TEZ-557
 URL: https://issues.apache.org/jira/browse/TEZ-557
 Project: Apache Tez
  Issue Type: Bug
Reporter: Hitesh Shah
Priority: Critical



[jira] [Updated] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1962:

Attachment: TEZ-1962.1.txt

Patch to fix this.

The main cause here is an NPE in a log line in case of an interrupt. The 
exception causes TezChild.run to exit without shutting down the executor 
and TaskReporter threads.

The patch fixes the NPE, adds some checks to ensure shutdown is called, and 
changes LocalContainerLauncher to invoke a TezChild shutdown in case of an 
error from TezChild.

I'm going to open a couple of follow-up JIRAs to change the way tasks are 
cancelled.

Tested locally, and there are no hung threads after this.

[~hitesh] - please review.
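
A rough illustration of the failure mode described in this comment, using only the JDK and not taken from TEZ-1962.1.txt: if an exception escapes before the executor is shut down, its worker thread stays alive, so the shutdown belongs in a finally block. ShutdownOnErrorSketch and runAndShutdown are hypothetical names.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ShutdownOnErrorSketch {
    // Runs a body that may throw and always shuts the executor down afterwards.
    static ExecutorService runAndShutdown(Runnable body) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.submit(() -> { }); // stand-in for the real work
            body.run();
        } catch (RuntimeException e) {
            // For example, an NPE thrown while building a log message.
            System.err.println("run failed: " + e);
        } finally {
            // Reached on both the normal and the error path, so no worker thread leaks.
            executor.shutdownNow();
        }
        return executor;
    }

    public static void main(String[] args) {
        String diagnostics = null;
        ExecutorService executor =
            runAndShutdown(() -> System.out.println(diagnostics.length()));
        System.out.println("executor shut down: " + executor.isShutdown());
    }
}
```

The same pattern applies to any helper threads started alongside the executor: stop them in the finally block rather than at the end of the happy path.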


 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt


 I've been trying to port the Hive unit tests to Tez local mode. However, 
 local mode seems to leak threads, which causes tests to crash after a while 
 (OOM). See the attached stack trace: there are a lot of TezChild threads 
 still hanging around.
 ([~sseth] as discussed offline)





[jira] [Updated] (TEZ-1962) Running out of threads in tez local mode

2015-01-15 Thread Siddharth Seth (JIRA)

 [ 
https://issues.apache.org/jira/browse/TEZ-1962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddharth Seth updated TEZ-1962:

Target Version/s: 0.6.0, 0.7.0, 0.5.4  (was: 0.7.0)

 Running out of threads in tez local mode
 

 Key: TEZ-1962
 URL: https://issues.apache.org/jira/browse/TEZ-1962
 Project: Apache Tez
  Issue Type: Sub-task
Reporter: Gunther Hagleitner
Assignee: Siddharth Seth
Priority: Critical
 Attachments: TEZ-1962.1.txt, stack5.txt




[jira] [Commented] (TEZ-1949) Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges

2015-01-15 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-1949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14279687#comment-14279687
 ] 

Hitesh Shah commented on TEZ-1949:
--

Should get this into 0.5.4 as well as 0.6.0 if possible.

 Whitelist TEZ_RUNTIME_OPTIMIZE_SHARED_FETCH for broadcast edges
 ---

 Key: TEZ-1949
 URL: https://issues.apache.org/jira/browse/TEZ-1949
 Project: Apache Tez
  Issue Type: Bug
Affects Versions: 0.6.0, 0.5.2, 0.7.0
Reporter: Gopal V
Assignee: Gopal V
 Attachments: TEZ-1949.1.patch

