[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701705#comment-14701705 ]

Saikat commented on TEZ-2726:

[~bikassaha] Yes. So is this a proper place to raise an exception? That is, raise an AMUserCodeException by checking this condition before sending out the CDMEs in Edge.java sendTezEventToDestinationTasks() for a scatter-gather edge.

Handle invalid number of partitions for SCATTER-GATHER edge
---
Key: TEZ-2726
URL: https://issues.apache.org/jira/browse/TEZ-2726
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3] and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with non-existent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and retrieve invalid headers due to an exception in the ShuffleHandler.

Possible fixes:
1. Raise a proper Tez exception to indicate this invalid scenario.
2. Or write appropriate empty partition bits for the missing partitions before sending out the DMEs to the sink vertex.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2294) Add tez-site-template.xml with description of config properties
[ https://issues.apache.org/jira/browse/TEZ-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701821#comment-14701821 ]

Siddharth Seth commented on TEZ-2294:

A couple of general questions.
- Will this be published on the website along with the release? Should the template file be generated and checked into the repository?
- It looks like this is analyzing all classes for an annotation, and then generating appropriate files. It may be simpler to generate the annotations for specific files (TezConfiguration and TezRuntimeConfiguration) for now. Another side effect of scanning all files is the creation of the apidocs/config directory in all modules, even though no config files exist.
- Double/float, int/long: differentiate between these?
- Can the type be inferred from the default value, when it exists?
- Documentation on how to generate these files would be useful (outside of mvn site).

Specifics
- An empty index.html ends up getting generated in apidocs/config, which can be confusing.
- {code} * @see <a href=../../../../../../configs/TezRuntimeConfiguration.html>Detailed Configuration Information</a>{code} Not sure what this will end up referring to, and where.
- TEZ_AM_RESOURCE_CPU_VCORES - type is string instead of integer.
- TezRuntimeConfiguration has no type information.
- Nits: Space between lines on the generated XML template.
- The XML generator likely needs some escaping. It generated invalid XML at the moment for TezConfiguration ()
- ConfigStandardDoclet - Has references like TEZ_SITE_XML, ENDS_WITH, TEZ_AM_STAGING_DIR. If using the ConfigurationProperty annotation, can all of these special cases be skipped? Alternately, skip using ConfigurationProperty altogether.
- Some commented out code in HtmlWriter and XmlWriter

Add tez-site-template.xml with description of config properties
---
Key: TEZ-2294
URL: https://issues.apache.org/jira/browse/TEZ-2294
Project: Apache Tez
Issue Type: Improvement
Reporter: Rajesh Balamohan
Assignee: Hitesh Shah
Attachments: TEZ-2294.4.patch, TEZ-2294.5.patch, TEZ-2294.6.patch, TEZ-2294.7.patch, TEZ-2294.wip.2.patch, TEZ-2294.wip.3.patch, TEZ-2294.wip.patch, TezConfiguration.html, TezRuntimeConfiguration.html, tez-default-template.xml, tez-runtime-default-template.xml

Document all tez configs with descriptions and default values. Also, document MR configs that can be easily translated to Tez configs via Tez helpers.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
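Two of the review points above (inferring a type from the default value, and escaping text written into the XML template) can be illustrated with a minimal sketch. Everything here is hypothetical: the class `ConfigTemplateSketch`, its methods, and the property name are made up for illustration and are not taken from the TEZ-2294 patch or from Tez itself. The inference is deliberately coarse; it cannot distinguish int from long or float from double, which is exactly the ambiguity raised in the comment.

```java
/** Hypothetical helper for illustration only; not part of Tez or the TEZ-2294 patch. */
final class ConfigTemplateSketch {

    /** Guess a coarse type name from a default value string. */
    static String inferType(String defaultValue) {
        if (defaultValue == null || defaultValue.trim().isEmpty()) {
            return "string"; // nothing to infer from
        }
        String v = defaultValue.trim();
        if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false")) {
            return "boolean";
        }
        try {
            Long.parseLong(v);
            return "integer"; // covers both int and long; the value alone cannot tell them apart
        } catch (NumberFormatException ignored) {
            // not an integral value, try floating point next
        }
        try {
            Double.parseDouble(v);
            return "float"; // covers both float and double
        } catch (NumberFormatException ignored) {
            // not numeric at all
        }
        return "string";
    }

    /** Escape the five predefined XML entities so the text is safe in element content or attributes. */
    static String escapeXml(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "tez.example.property";            // hypothetical property name
        String defaultValue = "4";
        String description = "Value must be > 0 & < 16"; // contains characters that need escaping
        System.out.println("<property>");
        System.out.println("  <name>" + escapeXml(name) + "</name>");
        System.out.println("  <value>" + escapeXml(defaultValue) + "</value>");
        System.out.println("  <type>" + inferType(defaultValue) + "</type>");
        System.out.println("  <description>" + escapeXml(description) + "</description>");
        System.out.println("</property>");
    }
}
```

Inference from the default only works when a default exists, and a value such as "1" could equally be a float setting, so an explicit type on the annotation would still be the more reliable source.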
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701684#comment-14701684 ]

Bikas Saha commented on TEZ-2726:

Ah. So the producer task wrote data and that generated a composite event. The edge was scatter-gather, so it expanded that event based on the number of downstream tasks (where num tasks == num partitions). So each downstream task got an input with a different partition index, and the ones that got indices 1 and 2 got the exception.

Handle invalid number of partitions for SCATTER-GATHER edge
---
Key: TEZ-2726
URL: https://issues.apache.org/jira/browse/TEZ-2726
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Saikat
Assignee: Saikat

Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3] and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with non-existent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and retrieve invalid headers due to an exception in the ShuffleHandler.

Possible fixes:
1. Raise a proper Tez exception to indicate this invalid scenario.
2. Or write appropriate empty partition bits for the missing partitions before sending out the DMEs to the sink vertex.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
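The expansion described in this comment, together with the validation proposed earlier in the thread, can be modelled with a small standalone sketch. It does not use any Tez classes: `ScatterGatherSketch`, `Routed`, and `expand` are hypothetical names, and `IllegalStateException` merely stands in for the AMUserCodeException mentioned in the discussion. The assumption is that the producer wrote a known number of partitions and that destination task i reads partition i.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of scatter-gather event expansion; not Tez code. */
final class ScatterGatherSketch {

    /** Stand-in for a per-task data movement event: which task reads which partition. */
    static final class Routed {
        final int destTask;
        final int partition;

        Routed(int destTask, int partition) {
            this.destTask = destTask;
            this.partition = partition;
        }
    }

    /**
     * Expand one composite event into one routed event per destination task.
     * Fails fast if a destination task would be pointed at a partition index
     * that the producer never wrote.
     */
    static List<Routed> expand(int partitionsWritten, int numDestTasks) {
        if (numDestTasks > partitionsWritten) {
            throw new IllegalStateException(
                "Producer wrote " + partitionsWritten + " partition(s) but the edge has "
                    + numDestTasks + " destination task(s)");
        }
        List<Routed> events = new ArrayList<>();
        for (int task = 0; task < numDestTasks; task++) {
            events.add(new Routed(task, task)); // task i fetches partition i
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println("valid expansion size: " + expand(3, 3).size());
        try {
            expand(1, 3); // the M = 1, N = 3 case from the issue description
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Checking before the events are sent means the failure surfaces at the point of misconfiguration, rather than later as invalid shuffle headers in the fetchers of the sink tasks.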
[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez
[ https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14701884#comment-14701884 ]

Siddharth Seth commented on TEZ-2164:

{code}
[ERROR] Failed to execute goal on project tez-api: Could not resolve dependencies for project org.apache.tez:tez-api:jar:0.8.0-SNAPSHOT: Failure to find org.apache.tez:guava-tez:jar:18.0 in https://repository.apache.org/content/repositories/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots.https has elapsed or updates are forced -> [Help 1]
{code}

Including guava-tez in the modules set in the top level pom gets further, but then fails with

{code}
[INFO] tez-job-analyzer .. SUCCESS [0.194s]
[INFO] tez-dist .. FAILURE [0.072s]
[INFO] Tez ... SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 39.757s
[INFO] Finished at: Tue Aug 18 12:54:32 PDT 2015
[INFO] Final Memory: 81M/480M
[INFO]
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-assembly-plugin:2.4:single (package-tez) on project tez-dist: Failed to create assembly: Error creating assembly archive tez-dist: You must set at least one file. -> [Help 1]
[ERROR]
{code}

One question. Does the shade plugin allow usage of the original package names in code, and have the shading done post compile? Otherwise, there'll be two options of each guava class, and we'll have to monitor each patch to avoid this.

Shade the guava version used by Tez
---
Key: TEZ-2164
URL: https://issues.apache.org/jira/browse/TEZ-2164
Project: Apache Tez
Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, allow-guava-16.0.1.patch

Should allow us to upgrade to a newer version without shipping a guava dependency. Would be good to do this in 0.7 so that we stop shipping guava as early as possible.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
[ https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-2730: - Attachment: TEZ-2730.1.patch tez-api missing dependency on org.codehaus.jettison for json - Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Attachments: TEZ-2730.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
[ https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702032#comment-14702032 ] Hitesh Shah commented on TEZ-2730: -- [~sseth] [~bikassaha] review please. tez-api missing dependency on org.codehaus.jettison for json - Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Attachments: TEZ-2730.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
[ https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah reassigned TEZ-2730: Assignee: Hitesh Shah tez-api missing dependency on org.codehaus.jettison for json - Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released
[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-2687: Attachment: TEZ-2687-1.patch ATS History shutdown happens before the min-held containers are released Key: TEZ-2687 URL: https://issues.apache.org/jira/browse/TEZ-2687 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.2, 0.8.0, 0.7.1 Reporter: Gopal V Assignee: Jeff Zhang Attachments: TEZ-2687-1.patch When ATS goes into a GC pause under heavy loads and while it recovers, each Tez AM holds onto a few containers even though it is shutting down and will never accept any more DAGs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2687) ATS History shutdown happens before the min-held containers are released
[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702420#comment-14702420 ] Jeff Zhang commented on TEZ-2687: - Upload patch to fix it. [~hitesh] [~bikassaha] Please help review. * Release the held containers before the DAGAppMaster#stopServices is called * Release the container at once if new container is allocated when it is shutting down. Simulate the ATS hang behavior by adding Thread.sleep in HistoryEventHandler#serviceStop Here's the log without this patch. Container will only be released when it is expired. {noformat} 2015-08-19 12:05:01,645 INFO [IPC Server handler 0 on 49920] app.DAGAppMaster: DAGAppMasterShutdownHandler invoked 2015-08-19 12:05:01,645 INFO [IPC Server handler 0 on 49920] app.DAGAppMaster: Handling DAGAppMaster shutdown 2015-08-19 12:05:01,645 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for 5 seconds before shutting down 2015-08-19 12:05:06,646 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop for all the services 2015-08-19 12:05:06,647 INFO [AMShutdownThread] history.HistoryEventHandler: Stopping HistoryEventHandler 2015-08-19 12:05:23,083 INFO [DelayedContainerManager] rm.YarnTaskSchedulerService: No taskRequests. Container's idle timeout delay expired or is new. 
Releasing container, containerId=container_1439946425329_0022_01_03, containerExpiryTime=1439957123078, idleTimeout=2, taskRequestsCount=0, heldContainers=4, delayedContainers=3, isNew=false 2015-08-19 12:05:23,083 INFO [DelayedContainerManager] rm.YarnTaskSchedulerService: Releasing unused container: container_1439946425329_0022_01_03 2015-08-19 12:05:23,083 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1439946425329_0022_1][Event:CONTAINER_STOPPED]: containerId=container_1439946425329_0022_01_03, stoppedTime=1439957123083, exitStatus=0 2015-08-19 12:05:23,083 INFO [Dispatcher thread: Central] container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 transitioned from IDLE to STOP_REQUESTED via event C_STOP_REQUEST 2015-08-19 12:05:23,083 INFO [ContainerLauncher #6] launcher.ContainerLauncherImpl: Processing the event EventType: CONTAINER_STOP_REQUEST 2015-08-19 12:05:23,084 INFO [ContainerLauncher #6] launcher.ContainerLauncherImpl: Sending a stop request to the NM for ContainerId: container_1439946425329_0022_01_03 2015-08-19 12:05:23,084 INFO [ContainerLauncher #6] impl.ContainerManagementProtocolProxy: Opening proxy : 192.168.3.3:50421 2015-08-19 12:05:23,090 INFO [Dispatcher thread: Central] container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 transitioned from STOP_REQUESTED to STOPPING via event C_NM_STOP_SENT 2015-08-19 12:05:23,222 INFO [IPC Server handler 1 on 49919] app.TaskAttemptListenerImpTezDag: Container with id: container_1439946425329_0022_01_03 is valid, but no longer registered, and will be killed 2015-08-19 12:05:23,373 INFO [AMRM Callback Handler Thread] rm.YarnTaskSchedulerService: Released container completed:container_1439946425329_0022_01_03 last allocated to task: attempt_1439946425329_0022_1_02_03_0 2015-08-19 12:05:23,373 INFO [Dispatcher thread: Central] container.AMContainerImpl: Container container_1439946425329_0022_01_03 exited with diagnostics set to 
Container failed, exitCode=-100. Container released by application 2015-08-19 12:05:23,373 INFO [Dispatcher thread: Central] container.AMContainerImpl: AMContainer container_1439946425329_0022_01_03 transitioned from STOPPING to COMPLETED via event C_COMPLETED {noformat} Here's the log with this patch, containers will be released explicitly when shutdown is invoked. {noformat} 2015-08-19 12:07:37,137 INFO [IPC Server handler 0 on 50138] app.DAGAppMaster: DAGAppMasterShutdownHandler invoked 2015-08-19 12:07:37,137 INFO [IPC Server handler 0 on 50138] app.DAGAppMaster: Handling DAGAppMaster shutdown 2015-08-19 12:07:37,138 INFO [AMShutdownThread] app.DAGAppMaster: Sleeping for 5 seconds before shutting down 2015-08-19 12:07:42,139 INFO [AMShutdownThread] app.DAGAppMaster: Calling stop for all the services 2015-08-19 12:07:42,139 INFO [AMShutdownThread] rm.YarnTaskSchedulerService: Realease held containers 2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1439946425329_0023_1][Event:CONTAINER_STOPPED]: containerId=container_1439946425329_0023_01_05, stoppedTime=1439957262139, exitStatus=0 2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] container.AMContainerImpl: AMContainer container_1439946425329_0023_01_05 transitioned from IDLE to STOP_REQUESTED via event C_STOP_REQUEST 2015-08-19 12:07:42,139 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1439946425329_0023_1][Event:CONTAINER_STOPPED]:
[jira] [Updated] (TEZ-2687) ATS History shutdown happens before the min-held containers are released
[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-2687: Attachment: (was: TEZ-2687-1.patch) ATS History shutdown happens before the min-held containers are released Key: TEZ-2687 URL: https://issues.apache.org/jira/browse/TEZ-2687 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.2, 0.8.0, 0.7.1 Reporter: Gopal V Assignee: Jeff Zhang When ATS goes into a GC pause under heavy loads and while it recovers, each Tez AM holds onto a few containers even though it is shutting down and will never accept any more DAGs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2690 PreCommit Build #1004
Jira: https://issues.apache.org/jira/browse/TEZ-2690 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1004/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 25 lines...] == Testing patch for TEZ-2690. == == HEAD is now at 24ca1de TEZ-2730. tez-api missing dependency on org.codehaus.jettison for json. (hitesh) Previous HEAD position was 24ca1de... TEZ-2730. tez-api missing dependency on org.codehaus.jettison for json. (hitesh) Switched to branch 'master' Your branch is behind 'origin/master' by 1 commit, and can be fast-forwarded. (use git pull to update your local branch) First, rewinding head to replay your work on top of it... Fast-forwarded master to 24ca1de0e12da3f9d165f1eda44c7076de0f2f12. TEZ-2690 patch is being downloaded at Wed Aug 19 04:32:43 UTC 2015 from http://issues.apache.org/jira/secure/attachment/12751186/criticalPath.jpg patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. patch: Only garbage was found in the patch input. The patch does not appear to apply with p0 to p2 PATCH APPLICATION FAILED {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12751186/criticalPath.jpg against master revision 24ca1de. {color:red}-1 patch{color}. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1004//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 635728cb2868e162e9dfd38f7f347fa64616de53 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Created] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
Hitesh Shah created TEZ-2730: Summary: tez-api missing dependency on org.codehaus.jettison for json Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Comment: was deleted (was: Adds a task attempt level critical path analyzer. Uses the scheduling and data event dependencies to walk from the last attempt completion to first attempt creation to account for the time taken in the job. The output of the analyzer is an svg rendering of the critical path. Attached sample. The svg code has been re-written to generate svg directly instead of using jaxb because of missing features in jaxb (e.g. setting the value of a text field). Renames existing critical path analyzer to vertex level. Adds an AnalyzerDriver to allow running analyzers from the command line using hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the driver because I am not sure how the other analyzers would behave on the command line. They are written to output csv results. Perhaps we can create a base Csv analyzer that can take the csv results and output them on the console or write them to a file. Then they could be run on the command line. The goal is to get this in and have motivated developers start running it and finding issues/improvements. [~rajesh.balamohan] Please review. !criticalPath.jpg|thumbnail!) Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.jpg Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702290#comment-14702290 ] Bikas Saha commented on TEZ-2690: - Adds a task attempt level critical path analyzer. Uses the scheduling and data event dependencies to walk from the last attempt completion to first attempt creation to account for the time taken in the job. The output of the analyzer is an svg rendering of the critical path. Attached sample. The svg code has been re-written to generate svg directly instead of using jaxb because of missing features in jaxb (e.g. setting the value of a text field). Renames existing critical path analyzer to vertex level. Adds an AnalyzerDriver to allow running analyzers from the command line using hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the driver because I am not sure how the other analyzers would behave on the command line. They are written to output csv results. Perhaps we can create a base Csv analyzer that can take the csv results and output them on the console or write them to a file. Then they could be run on the command line. The goal is to get this in and have motivated developers start running it and finding issues/improvements. [~rajesh.balamohan] Please review. !criticalPath.jpg! Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.jpg Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
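The backward walk described in this comment (from the last attempt completion to the first attempt creation) can be pictured with a toy model. This is not the analyzer from the patch: `CriticalPathSketch`, `Attempt`, and the `blockedOn` field are hypothetical, and the sketch assumes each attempt already records the single attempt whose scheduling or data event last gated it.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical toy model of a task-attempt-level critical path; not the TEZ-2690 code. */
final class CriticalPathSketch {

    static final class Attempt {
        final String id;
        final long created;      // creation time, in ms
        final long finished;     // completion time, in ms
        final Attempt blockedOn; // attempt whose scheduling or data event last gated this one, or null

        Attempt(String id, long created, long finished, Attempt blockedOn) {
            this.id = id;
            this.created = created;
            this.finished = finished;
            this.blockedOn = blockedOn;
        }
    }

    /** Walk back from the last finished attempt; the result is ordered first to last. */
    static List<Attempt> criticalPath(Attempt lastFinished) {
        Deque<Attempt> path = new ArrayDeque<>();
        for (Attempt a = lastFinished; a != null; a = a.blockedOn) {
            path.addFirst(a);
        }
        return new ArrayList<>(path);
    }

    /** Wall-clock time covered by the path: first creation to last completion. */
    static long span(List<Attempt> path) {
        return path.get(path.size() - 1).finished - path.get(0).created;
    }

    public static void main(String[] args) {
        Attempt map = new Attempt("map_0", 0L, 40L, null);
        Attempt reduce = new Attempt("reduce_0", 10L, 90L, map);
        List<Attempt> path = criticalPath(reduce);
        for (Attempt a : path) {
            System.out.println(a.id + " [" + a.created + ", " + a.finished + "]");
        }
        System.out.println("span ms: " + span(path));
    }
}
```

A real analyzer would have to choose among several candidate dependencies per attempt; the single `blockedOn` pointer only keeps the walk itself easy to see.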
[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
[ https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702302#comment-14702302 ] TezQA commented on TEZ-2730: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12751128/TEZ-2730.1.patch against master revision 6cb8206. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//console This message is automatically generated. tez-api missing dependency on org.codehaus.jettison for json - Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Attachments: TEZ-2730.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2730 PreCommit Build #1003
Jira: https://issues.apache.org/jira/browse/TEZ-2730 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/1003/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 3289 lines...] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12751128/TEZ-2730.1.patch against master revision 6cb8206. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 3.0.1) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1003//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 9e41068f4e8172aa1092f4923b9989c76998a72e logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #1000 Archived 50 artifacts Archive block size is 32768 Received 2 blocks and 3001018 bytes Compression is 2.1% Took 0.96 sec [description-setter] Could not determine description. Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702134#comment-14702134 ]

Jonathan Eagles commented on TEZ-2300:

Do we have consensus on an approach?

TezClient:stop internally calls shutdownTezAM asynchronously by sending a DAG_KILL event.
DAGClient:tryKillDAG synchronously calls dispatcher handle event on the DAG_KILL event, which calls for all Vertexes, etc.

TezClient.stop() takes a lot of time or does not work sometimes
---
Key: TEZ-2300
URL: https://issues.apache.org/jira/browse/TEZ-2300
Project: Apache Tez
Issue Type: Bug
Reporter: Rohini Palaniswamy
Assignee: Jonathan Eagles
Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch, TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post

Noticed this with a couple of pig scripts which were not behaving well (AM close to OOM, etc.) and even with some that were running fine. Pig calls TezClient.stop() in a shutdown hook. Ctrl+C to the pig script either exits immediately or hangs. In either case it takes a long time for the yarn application to go to KILLED state. Many times I just end up calling yarn application -kill separately after waiting for 5 mins or more for it to get killed.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2687) ATS History shutdown happens before the min-held containers are released
[ https://issues.apache.org/jira/browse/TEZ-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang reassigned TEZ-2687: --- Assignee: Jeff Zhang ATS History shutdown happens before the min-held containers are released Key: TEZ-2687 URL: https://issues.apache.org/jira/browse/TEZ-2687 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.2, 0.8.0, 0.7.1 Reporter: Gopal V Assignee: Jeff Zhang When ATS goes into a GC pause under heavy loads and while it recovers, each Tez AM holds onto a few containers even though it is shutting down and will never accept any more DAGs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez
[ https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702181#comment-14702181 ]

Rajesh Balamohan commented on TEZ-2164:

- Built (guava-tez-18.jar got added to $TEZ_HOME/lib/), tested on a multi-node cluster setup, and ran a couple of jobs (hive workload). Works fine without issues.
- Possibly can remove unintentional changes to ConcatenatedMergedKeyValueInput, DAGEventStartDag, DAGRecoveredEvent, EdgeManagerForTest, HistoryACLPolicyException, InitialMemoryAllocator, KVDataGen, OnStateChangedCallback, MultiStageMRConfToTezTranslator, Output, StateMachineTez, SVGUtils, TaskEventScheduleTask, TestHistoryEventTimelineConversion, TestShuffleInputEventHandlerOrderedGrouped, TezBodyDeferringAsyncHandler
- Imports need to be rearranged (e.g. ExternalSorter, InputInitializerEvent, MROutput, etc.), or this can be deferred for later.

Shade the guava version used by Tez
---
Key: TEZ-2164
URL: https://issues.apache.org/jira/browse/TEZ-2164
Project: Apache Tez
Issue Type: Improvement
Reporter: Siddharth Seth
Assignee: Hitesh Shah
Priority: Critical
Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, allow-guava-16.0.1.patch

Should allow us to upgrade to a newer version without shipping a guava dependency. Would be good to do this in 0.7 so that we stop shipping guava as early as possible.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Attachment: TEZ-2690.1.patch Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282 ] Bikas Saha commented on TEZ-2690: - Adds a task attempt level critical path analyzer. Uses the scheduling and data event dependencies to walk from the last attempt completion to first attempt creation to account for the time taken in the job. The output of the analyzer is an svg rendering of the critical path. Attached sample. The svg code has been re-written to generate svg directly instead of using jaxb because of missing features in jaxb (e.g. setting the value of a text field). Renames existing critical path analyzer to vertex level. Adds an AnalyzerDriver to allow running analyzers from the command line using hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the driver because I am not sure how the other analyzers would behave on the command line. They are written to output csv results. Perhaps we can create a base Csv analyzer that can take the csv results and output them on the console or write them to a file. Then they could be run on the command line. The goal is to get this in and have motivated developers start running it and finding issues/improvements. [~rajesh.balamohan] Please review. Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.png Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Attachment: criticalPath.png Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.png Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282 ] Bikas Saha edited comment on TEZ-2690 at 8/19/15 1:39 AM: -- Adds a task attempt level critical path analyzer. Uses the scheduling and data event dependencies to walk from the last attempt completion to first attempt creation to account for the time taken in the job. The output of the analyzer is an svg rendering of the critical path. Attached sample. The svg code has been re-written to generate svg directly instead of using jaxb because of missing features in jaxb (e.g. setting the value of a text field). Renames existing critical path analyzer to vertex level. Adds an AnalyzerDriver to allow running analyzers from the command line using hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the driver because I am not sure how the other analyzers would behave on the command line. They are written to output csv results. Perhaps we can create a base Csv analyzer that can take the csv results and output them on the console or write them to a file. Then they could be run on the command line. The goal is to get this in and have motivated developers start running it and finding issues/improvements. [~rajesh.balamohan] Please review. !criticalPath.png|thumbnail! Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.png Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2629) LimitExceededException in Tez client when DAG has exceeds the default max
[ https://issues.apache.org/jira/browse/TEZ-2629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702207#comment-14702207 ] Hitesh Shah commented on TEZ-2629: -- +1 LimitExceededException in Tez client when DAG has exceeds the default max - Key: TEZ-2629 URL: https://issues.apache.org/jira/browse/TEZ-2629 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.0 Reporter: Jason Dere Assignee: Siddharth Seth Attachments: TEZ-2629.1.txt
Original issue was HIVE-11303, seeing LimitExceededException when the client tries to get the counters for a completed job:
{noformat}
2015-07-17 18:18:11,830 INFO [main]: counters.Limits (Limits.java:ensureInitialized(59)) - Counter limits initialized with parameters: GROUP_NAME_MAX=256, MAX_GROUPS=500, COUNTER_NAME_MAX=64, MAX_COUNTERS=1200
2015-07-17 18:18:11,841 ERROR [main]: exec.Task (TezTask.java:execute(189)) - Failed to execute tez graph.
org.apache.tez.common.counters.LimitExceededException: Too many counters: 1201 max=1200
    at org.apache.tez.common.counters.Limits.checkCounters(Limits.java:87)
    at org.apache.tez.common.counters.Limits.incrCounters(Limits.java:94)
    at org.apache.tez.common.counters.AbstractCounterGroup.addCounter(AbstractCounterGroup.java:76)
    at org.apache.tez.common.counters.AbstractCounterGroup.addCounterImpl(AbstractCounterGroup.java:93)
    at org.apache.tez.common.counters.AbstractCounterGroup.findCounter(AbstractCounterGroup.java:104)
    at org.apache.tez.dag.api.DagTypeConverters.convertTezCountersFromProto(DagTypeConverters.java:567)
    at org.apache.tez.dag.api.client.DAGStatus.getDAGCounters(DAGStatus.java:148)
    at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:175)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1673)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1432)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1213)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1064)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1054)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:213)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:165)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:311)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:409)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:425)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:681)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:621)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}
It looks like Limits.ensureInitialized() is defaulting to an empty configuration, resulting in COUNTERS_MAX being set to the default of 1200 (even though Hive's configuration specified tez.counters.max=16000). Per [~sseth]:
{quote}
I think the Tez client does need to make this call to setup the Configuration correctly. We do this for the AM and the executing task - which is why it works. Could you please open a Tez jira for this ? Also, Limits is making use of Configuration instead of TezConfiguration for default initialization, which implies changes to tez-site on the local node won't be picked up.
{quote}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
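The failure mode described in this issue, a lazily initialized limit that falls back to an empty configuration when nobody supplied one, can be illustrated with a small stand-alone sketch. Everything here is a hypothetical stand-in (`LimitsSketch`, its methods, and the use of a plain `Map` instead of a Hadoop `Configuration`); only the key `tez.counters.max` and the default of 1200 come from the report above. It is not the Tez `Limits` class.

```java
import java.util.HashMap;
import java.util.Map;

class LimitsSketch {
    static final String COUNTERS_MAX_KEY = "tez.counters.max"; // key named in the report
    static final int COUNTERS_MAX_DEFAULT = 1200;              // default seen in the log

    private static Map<String, String> conf; // stays null until a caller supplies one
    private static boolean initialized;
    private static int countersMax;

    /** Must be called before the first lookup; later calls are ignored. */
    static synchronized void setConfiguration(Map<String, String> userConf) {
        if (!initialized && userConf != null) {
            conf = new HashMap<>(userConf);
        }
    }

    private static synchronized void ensureInitialized() {
        if (initialized) {
            return;
        }
        if (conf == null) {
            conf = new HashMap<>(); // empty configuration: only defaults apply
        }
        countersMax = Integer.parseInt(
            conf.getOrDefault(COUNTERS_MAX_KEY, String.valueOf(COUNTERS_MAX_DEFAULT)));
        initialized = true;
    }

    static synchronized int getCountersMax() {
        ensureInitialized();
        return countersMax;
    }

    /** Test-only helper so the sketch can be exercised more than once. */
    static synchronized void reset() {
        conf = null;
        initialized = false;
    }
}
```

If the client never calls `setConfiguration` before the first counter lookup, the limit silently stays at the default, and a later call with the user's value has no effect. That matches the symptom in the report, where the AM and tasks (which do set the configuration) work but the client does not.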
[jira] [Commented] (TEZ-2300) TezClient.stop() takes a lot of time or does not work sometimes
[ https://issues.apache.org/jira/browse/TEZ-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702216#comment-14702216 ] Bikas Saha commented on TEZ-2300: - The current patch is useful because it ensures that the app is killed after some max deadline. In addition to that, if we want to ensure ATS is flushed by keeping the AM alive, we could, in shutdownTezAM:
1) send a release-containers signal to the scheduler (this will reduce resource usage)
2) ensure DAG kill is initiated (this may already be happening, but Rohini mentioned she saw allocations happen during this time)
3) call stop() to stop asynchronously (this includes the flush to ATS)
and then return. Thoughts?
TezClient.stop() takes a lot of time or does not work sometimes --- Key: TEZ-2300 URL: https://issues.apache.org/jira/browse/TEZ-2300 Project: Apache Tez Issue Type: Bug Reporter: Rohini Palaniswamy Assignee: Jonathan Eagles Attachments: TEZ-2300.1.patch, TEZ-2300.2.patch, TEZ-2300.3.patch, TEZ-2300.4.patch, syslog_dag_1428329756093_325099_1_post
Noticed this with a couple of Pig scripts that were not behaving well (AM close to OOM, etc.) and even with some that were running fine. Pig calls TezClient.stop() in a shutdown hook. Ctrl+C to the Pig script either exits immediately or hangs. In both cases it takes a long time for the YARN application to go to the KILLED state. Many times I just end up calling yarn application -kill separately after waiting 5 minutes or more for it to get killed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
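The general idea of "stop gracefully, but kill after some max deadline" can be sketched with plain java.util.concurrent primitives. This is a generic illustration, not the TEZ-2300 patch: `ShutdownSketch`, `stopWithDeadline`, and the two `Runnable` parameters are hypothetical names standing in for the graceful stop and the forced kill.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ShutdownSketch {

    /**
     * Runs gracefulStop on a helper thread and waits up to timeoutMillis.
     * Returns true if it finished in time; otherwise runs forceKill and returns false.
     */
    static boolean stopWithDeadline(Runnable gracefulStop, Runnable forceKill, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "graceful-stop");
            t.setDaemon(true); // never keep the JVM alive because of a stuck stop
            return t;
        });
        Future<?> pending = executor.submit(gracefulStop);
        try {
            pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            pending.cancel(true); // interrupt the stuck graceful stop
            forceKill.run();
            return false;
        } catch (ExecutionException e) {
            forceKill.run();      // graceful stop itself failed
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            forceKill.run();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A caller would pass something like the asynchronous stop as the first argument and the equivalent of `yarn application -kill` as the second. The choice of deadline is a trade-off between giving the flush time to complete and how long a user waits after Ctrl+C.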
[jira] [Updated] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Attachment: criticalPath.jpg Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.jpg Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702282#comment-14702282 ] Bikas Saha edited comment on TEZ-2690 at 8/19/15 1:41 AM: -- Adds a task attempt level critical path analyzer. Uses the scheduling and data event dependencies to walk from the last attempt completion to first attempt creation to account for the time taken in the job. The output of the analyzer is an svg rendering of the critical path. Attached sample. The svg code has been re-written to generate svg directly instead of using jaxb because of missing features in jaxb (e.g. setting the value of a text field). Renames existing critical path analyzer to vertex level. Adds an AnalyzerDriver to allow running analyzers from the command line using hadoop jar command. Only the latest CriticalPathAnalyzer has been added to the driver because I am not sure how the other analyzers would behave on the command line. They are written to output csv results. Perhaps we can create a base Csv analyzer that can take the csv results and output them on the console or write them to a file. Then they could be run on the command line. The goal is to get this in and have motivated developers start running it and finding issues/improvements. [~rajesh.balamohan] Please review. !criticalPath.jpg|thumbnail! Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.jpg Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Attachment: (was: TEZ-2690.1.patch) Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2690: Attachment: (was: criticalPath.png) Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2690) Add critical path analyser
[ https://issues.apache.org/jira/browse/TEZ-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702537#comment-14702537 ] Rajesh Balamohan commented on TEZ-2690: --- lgtm overall. Minor comments.
- TezAnalyzerBase
-- Since the analyzer is not going to download the data, it might be good to add a comment about the DagId that needs to be downloaded.
-- Is the main() function needed in the base class? Or is it given mainly as an example?
-- Since the base already extends Configured, Analyzer.getConfiguration() should be removed. But this would be a separate JIRA to let all analyzers extend TezAnalyzerBase.
-- Printing usage might be useful (e.g. one currently needs to refer to the code for the optional parameter outputDir).
- Are the changes in VertexInfo unintentional?
- SVGUtils - It might break the earlier drawVertex(DagInfo). But that can be added later as part of refactoring the other analyzers to extend TezAnalyzerBase.
- CriticalPathAnalyzer
-- getLastDataEventTime, getCreationTime etc. were added as part of TEZ-2701. So if we try to parse older logs (e.g. 0.8/0.7/0.6), it might return 0 for currentAttempt.getLastDataEventTime().
-- Should checks comparing currentAttempt.getLastDataEventTime() against 0 be added for such cases, to fail fast if the logs do not have those details? Other calculations (e.g. the Strings.isNullOrEmpty(currentAttempt.getLastDataEventSourceTA()) check) can also become invalid. So it might be good to consider failing fast if the logs do not have the info that the analyzer is looking for.
Will go through the CriticalPathAnalyzer in more detail and post comments if any.
Add critical path analyser -- Key: TEZ-2690 URL: https://issues.apache.org/jira/browse/TEZ-2690 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2690.1.patch, criticalPath.jpg Use input and scheduling dependencies to create critical path for a DAG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
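The fail-fast suggestion in the review above could look roughly like the following. This is a sketch under assumptions, not code from the patch: `AttemptTimes` and `requireTimingInfo` are hypothetical, and treating a value of 0 as "not recorded in the logs" is an assumption taken from the remark that older logs might return 0.

```java
class AnalyzerPreconditions {

    /** Hypothetical holder for the timing fields the analyzer depends on. */
    static final class AttemptTimes {
        final String attemptId;
        final long creationTime;
        final long lastDataEventTime;

        AttemptTimes(String attemptId, long creationTime, long lastDataEventTime) {
            this.attemptId = attemptId;
            this.creationTime = creationTime;
            this.lastDataEventTime = lastDataEventTime;
        }
    }

    /** Throws early instead of letting a missing (zero) timestamp skew later results. */
    static void requireTimingInfo(AttemptTimes attempt) {
        if (attempt.creationTime <= 0 || attempt.lastDataEventTime <= 0) {
            throw new IllegalStateException(
                "Attempt " + attempt.attemptId
                    + " has no creation/data event timing; the logs may predate these fields");
        }
    }
}
```

Attempts that legitimately have no input data events might also report 0, so a real check would probably need to distinguish "field absent from the log" from "no data event happened"; the sketch does not attempt that.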
[jira] [Updated] (TEZ-2725) Tez UI: Unit tests framework integration
[ https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreenath Somarajapuram updated TEZ-2725: Description: - Investigate the best UT framework for Tez UI, and integrate the same into the codebase. - UTs for each module would be added as part of the respective patch. Tez UI: Unit tests framework integration Key: TEZ-2725 URL: https://issues.apache.org/jira/browse/TEZ-2725 Project: Apache Tez Issue Type: Bug Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram - Investigate the best UT framework for Tez UI, and integrate the same into the codebase. - UTs for each module would be added as part of the respective patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2729) Standalone sample UI for running tez job analyzers
Rajesh Balamohan created TEZ-2729: - Summary: Standalone sample UI for running tez job analyzers Key: TEZ-2729 URL: https://issues.apache.org/jira/browse/TEZ-2729 Project: Apache Tez Issue Type: Wish Reporter: Rajesh Balamohan Assignee: Rajesh Balamohan -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2729) Standalone sample UI for running tez job analyzers
[ https://issues.apache.org/jira/browse/TEZ-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-2729: -- Attachment: TEZ-2729.WIP.PoC.1.patch Attaching a very preliminary PoC patch which uses the Vaadin web framework for rendering analyzer results. Users can download ATS data via ATSImportTool or from tez-ui. The downloaded zip file can be uploaded to this UI for analysis (i.e. to run a bunch of analyzers and render the results, mostly in CSV format). Please note that the UI is just PoC/sample code. Installation instructions are provided in INSTALL.txt Standalone sample UI for running tez job analyzers -- Key: TEZ-2729 URL: https://issues.apache.org/jira/browse/TEZ-2729 Project: Apache Tez Issue Type: Wish Reporter: Rajesh Balamohan Assignee: Rajesh Balamohan Attachments: TEZ-2729.WIP.PoC.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2725) Tez UI: Unit tests framework integration
[ https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sreenath Somarajapuram updated TEZ-2725: Summary: Tez UI: Unit tests framework integration (was: Tez UI: Unit tests) Tez UI: Unit tests framework integration Key: TEZ-2725 URL: https://issues.apache.org/jira/browse/TEZ-2725 Project: Apache Tez Issue Type: Bug Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2725) Tez UI: Unit tests framework integration
[ https://issues.apache.org/jira/browse/TEZ-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701328#comment-14701328 ] Sreenath Somarajapuram commented on TEZ-2725: - Have added the details. This ticket would be to just incorporate the UT framework. Tez UI: Unit tests framework integration Key: TEZ-2725 URL: https://issues.apache.org/jira/browse/TEZ-2725 Project: Apache Tez Issue Type: Bug Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram - Investigate the best UT framework for Tez UI, and integrate the same into the codebase. - UTs for each module would be added as part of the respective patch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2730) tez-api missing dependency on org.codehaus.jettison for json
[ https://issues.apache.org/jira/browse/TEZ-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702037#comment-14702037 ] Bikas Saha commented on TEZ-2730: - lgtm tez-api missing dependency on org.codehaus.jettison for json - Key: TEZ-2730 URL: https://issues.apache.org/jira/browse/TEZ-2730 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Hitesh Shah Attachments: TEZ-2730.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2164) Shade the guava version used by Tez
[ https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702036#comment-14702036 ] Hitesh Shah edited comment on TEZ-2164 at 8/18/15 9:42 PM: --- You will need to compile guava-tez first - it is not part of the top-level module list. I tried adding it but tez-dist hit some errors. bq. Does the shade plugin allow usage of the original package names in code, and have the shading done post compile ? I tried that approach but was not successful due to a set of reasons: - the relocation happens on jar creation - unit tests in other modules when referencing internal apis using guava breaks as they use normal guava packages and not the relocated ones - the only seamless way to do this is create a fat jar in tez-dist assembly with all guava relocated. However this still does not solve the case where tez wants to use a newer guava version. And yes, we will need to monitor each patch to see that com.google does not creep in - which is a possibility given that we cannot remove guava-11 as a compile time dependency ( caused by hadoop yarn using guava objects in its apis ) Shade the guava version used by Tez --- Key: TEZ-2164 URL: https://issues.apache.org/jira/browse/TEZ-2164 Project: Apache Tez Issue Type: Improvement Reporter: Siddharth Seth Assignee: Hitesh Shah Priority: Critical Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, allow-guava-16.0.1.patch Should allow us to upgrade to a newer version without shipping a guava dependency. Would be good to do this in 0.7 so that we stop shipping guava as early as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2164) Shade the guava version used by Tez
[ https://issues.apache.org/jira/browse/TEZ-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14702036#comment-14702036 ] Hitesh Shah commented on TEZ-2164: -- You will need to compile guava-tez first - it is not part of the top-level module list. bq. Does the shade plugin allow usage of the original package names in code, and have the shading done post compile ? I tried that approach but was not successful due to a set of reasons: - the relocation happens on jar creation - unit tests in other modules when referencing internal apis using guava breaks as they use normal guava packages and not the relocated ones - the only seamless way to do this is create a fat jar in tez-dist assembly with all guava relocated. However this still does not solve the case where tez wants to use a newer guava version. And yes, we will need to monitor each patch to see that com.google does not creep in - which is a possibility given that we cannot remove guava-11 as a compile time dependency ( caused by hadoop yarn using guava objects in its apis ) Shade the guava version used by Tez --- Key: TEZ-2164 URL: https://issues.apache.org/jira/browse/TEZ-2164 Project: Apache Tez Issue Type: Improvement Reporter: Siddharth Seth Assignee: Hitesh Shah Priority: Critical Attachments: TEZ-2164.3.patch, TEZ-2164.wip.2.patch, allow-guava-16.0.1.patch Should allow us to upgrade to a newer version without shipping a guava dependency. Would be good to do this in 0.7 so that we stop shipping guava as early as possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14701505#comment-14701505 ] Saikat commented on TEZ-2726: - [~rajesh.balamohan] [~bikassaha] There are no empty partitions in the example I mentioned. The source vertex has 1 task (it used an UnorderedKVOutput, so it produced only 1 partition) and the sink vertex has 3 tasks. The edge is of type SCATTER-GATHER. When the HTTP fetchers sent a request to fetch the map outputs, the code in ShuffleHandler catches an IOException from the IndexCache.java getIndexInformation() function for the condition [info.mapSpillRecord.size() <= reduce].
2015-08-10 12:36:42,314 [New I/O worker #32] ERROR mapred.ShuffleHandler: Shuffle error in populating headers :
java.io.IOException: Invalid request Map Id = attempt_1437478617943_17839_1_05_00_0_10003 Reducer = 1 Index Info Length = 1
    at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:84)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.getMapOutputInfo(ShuffleHandler.java:855)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.populateHeaders(ShuffleHandler.java:875)
    at org.apache.hadoop.mapred.ShuffleHandler$Shuffle.messageReceived(ShuffleHandler.java:793)
I'll try to get an excerpt of the Fetcher logs for DMEs and post it here.
Handle invalid number of partitions for SCATTER-GATHER edge --- Key: TEZ-2726 URL: https://issues.apache.org/jira/browse/TEZ-2726 Project: Apache Tez Issue Type: Improvement Reporter: Saikat Assignee: Saikat
Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3] and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with non-existent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and retrieve invalid headers due to an exception in the ShuffleHandler. Possible fixes:
1. raise a proper Tez exception to indicate this invalid scenario.
2. or write appropriate empty partition bits for the missing partitions before sending out the DMEs to the sink vertex.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
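The first proposed fix (raise an exception for the invalid scenario) amounts to a precondition on the edge: each source task has to produce at least as many partitions as there are sink tasks that will try to fetch one. A minimal sketch of such a check follows; `ScatterGatherCheck` and `validate` are hypothetical names, and `IllegalStateException` is a stand-in for whatever Tez exception type would actually be used.

```java
class ScatterGatherCheck {

    /**
     * Rejects a scatter-gather edge whose source output has fewer partitions
     * than there are destination tasks to read them.
     */
    static void validate(int sourcePartitions, int destinationTasks) {
        if (sourcePartitions < destinationTasks) {
            throw new IllegalStateException(
                "SCATTER-GATHER edge: source produced " + sourcePartitions
                    + " partition(s) but the sink vertex has " + destinationTasks + " task(s)");
        }
    }

    public static void main(String[] args) {
        validate(3, 3); // consistent edge, no exception
        try {
            validate(1, 3); // the case from the report: 1 partition, 3 sink tasks
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at this point would surface the mismatch before any DMEs are routed, instead of as a shuffle header error on the fetcher side.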
[jira] [Updated] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-2726: - Affects Version/s: 0.7.1 Handle invalid number of partitions for SCATTER-GATHER edge --- Key: TEZ-2726 URL: https://issues.apache.org/jira/browse/TEZ-2726 Project: Apache Tez Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Saikat Assignee: Saikat Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3] and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with non-existent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and retrieve invalid headers due to an exception in the ShuffleHandler. Possible fixes: 1. raise a proper Tez exception to indicate this invalid scenario. 2. or write appropriate empty partition bits for the missing partitions before sending out the DMEs to the sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2726) Handle invalid number of partitions for SCATTER-GATHER edge
[ https://issues.apache.org/jira/browse/TEZ-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Eagles updated TEZ-2726: - Affects Version/s: (was: 0.7.1) 0.7.0 Handle invalid number of partitions for SCATTER-GATHER edge --- Key: TEZ-2726 URL: https://issues.apache.org/jira/browse/TEZ-2726 Project: Apache Tez Issue Type: Improvement Affects Versions: 0.7.0 Reporter: Saikat Assignee: Saikat Encountered an issue where the source vertex has M tasks and the sink vertex has N tasks (N > M) [e.g. M = 1, N = 3] and the edge is of type SCATTER-GATHER. This resulted in the sink vertex receiving DMEs with non-existent targetIds. The fetchers for the sink vertex tasks then try to retrieve the map outputs and retrieve invalid headers due to an exception in the ShuffleHandler. Possible fixes: 1. raise a proper Tez exception to indicate this invalid scenario. 2. or write appropriate empty partition bits for the missing partitions before sending out the DMEs to the sink vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)