[jira] [Created] (TEZ-2350) TEZ-UI: Under vertex, rename Sources and Sinks to Configurations and then show configs for sources, sinks, processor and edges.
Sreenath Somarajapuram created TEZ-2350: --- Summary: TEZ-UI: Under vertex, rename Sources and Sinks to Configurations and then show configs for sources, sinks, processor and edges. Key: TEZ-2350 URL: https://issues.apache.org/jira/browse/TEZ-2350 Project: Apache Tez Issue Type: Improvement Reporter: Sreenath Somarajapuram -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504457#comment-14504457 ] Bikas Saha commented on TEZ-2341: - WindowsBasedProcessTree in Hadoop exists for Windows on Hadoop. So I don't think we can just ignore this in the test. TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code}
[jira] [Commented] (TEZ-2344) TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504411#comment-14504411 ] Prakash Ramachandran commented on TEZ-2344: --- LGTM +1. One minor question, though: the bounded-cell view is not actually bound (its value can be set after a delay, but later changes to the property values are not reflected). Are you planning to make it actually bound? TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table Key: TEZ-2344 URL: https://issues.apache.org/jira/browse/TEZ-2344 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2344.1.patch 1. Must handle promises, objects and primitive data types. 2. Must be generic. 3. Display waiting animation or Not Available! messages when required.
[jira] [Commented] (TEZ-2346) TEZ-UI: Load other info / counter data on demand
[ https://issues.apache.org/jira/browse/TEZ-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504455#comment-14504455 ] Sreenath Somarajapuram commented on TEZ-2346: - Thanks [~zjshen]. In the above scenario I am trying to filter based on status and applicationId. As both of them are available in primaryfilters, I am not sure why we depend on otherinfo. Sorry if I have missed something. TEZ-UI: Load other info / counter data on demand Key: TEZ-2346 URL: https://issues.apache.org/jira/browse/TEZ-2346 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: Screen-Shot-2015-04-21-at-1.56.28-AM.jpg, TEZ-2346.wip.1.patch
[jira] [Commented] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505601#comment-14505601 ] TezQA commented on TEZ-2345: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726958/TEZ-2345.2.patch against master revision 87aac12. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/502//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/502//console This message is automatically generated. TEZ-UI: Enable cell level loading in all DAGs table --- Key: TEZ-2345 URL: https://issues.apache.org/jira/browse/TEZ-2345 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2345.1.patch, TEZ-2345.2.patch - Enable cell level loading in all DAGs table using basic-ember-table component. - Re-arrange UI elements to make them similar to other tables.
[jira] [Updated] (TEZ-2340) TestRecoveryParser fails
[ https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-2340: Attachment: TEZ-2340-1.patch TestRecoveryParser fails Key: TEZ-2340 URL: https://issues.apache.org/jira/browse/TEZ-2340 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: TEZ-2340-1.patch Stacktrace {code} java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) {code} Standard Output {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. 
2015-04-17 07:23:55,703 INFO [Thread-5] impl.TestDAGImpl (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan 2015-04-17 07:23:55,722 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService 2015-04-17 07:23:55,723 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceStart(127)) - Starting RecoveryService 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(314)) - Error handling summary event, eventType=DAG_SUBMITTED java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt does not start up, 
flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all events 2015-04-17 07:23:55,756 ERROR [RecoveryEventHandlingThread] recovery.RecoveryService (RecoveryService.java:run(146)) - Recovery failure occurred. Stopping recovery thread. Current eventQueueSize=0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2340) TestRecoveryParser fails
[ https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504739#comment-14504739 ] Jeff Zhang commented on TEZ-2340: - The root cause of the test failure is that all the test cases use the same directory for recovery, so the delete operation may fail because the previous test case may not have closed its file stream. Attaching a patch that uses a different recovery path for each test case. TestRecoveryParser fails Key: TEZ-2340 URL: https://issues.apache.org/jira/browse/TEZ-2340 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: TEZ-2340-1.patch Stacktrace {code} java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) {code} Standard Output {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists.
2015-04-17 07:23:55,703 INFO [Thread-5] impl.TestDAGImpl (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan 2015-04-17 07:23:55,722 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService 2015-04-17 07:23:55,723 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceStart(127)) - Starting RecoveryService 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(314)) - Error handling summary event, eventType=DAG_SUBMITTED java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt does not start up, 
flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all events 2015-04-17 07:23:55,756 ERROR [RecoveryEventHandlingThread] recovery.RecoveryService (RecoveryService.java:run(146)) - Recovery failure occurred. Stopping recovery thread. Current eventQueueSize=0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
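The fix described in the comment above (a separate recovery path per test case) can be illustrated with a minimal sketch. The attached patch is not reproduced here, so this is not the patch itself: the class name RecoveryDirs and the method newRecoveryDir are hypothetical, and the sketch uses only the JDK's java.nio.file API rather than the Hadoop FileSystem calls the real test uses.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of Tez): one recovery directory per test case. */
final class RecoveryDirs {
    private RecoveryDirs() {}

    /**
     * Creates a fresh directory under baseDir whose name starts with the test name.
     * Files.createTempDirectory appends a random suffix, so two test cases never
     * share a path, even if an earlier one left a file stream open.
     */
    static Path newRecoveryDir(Path baseDir, String testName) throws IOException {
        Files.createDirectories(baseDir);
        return Files.createTempDirectory(baseDir, testName + "-");
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("recovery-base");
        Path first = newRecoveryDir(base, "testSkipAllOtherEvents_1");
        Path second = newRecoveryDir(base, "testSkipAllOtherEvents_1");
        // The two directories are distinct, so neither test has to delete the other's files.
        System.out.println(first + " differs from " + second + ": " + !first.equals(second));
    }
}
```

With this approach a test no longer depends on the previous test's cleanup succeeding, which is the failure mode the log output shows on Windows, where an open file cannot be deleted.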
[jira] [Updated] (TEZ-2340) TestRecoveryParser fails
[ https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-2340: Summary: TestRecoveryParser fails (was: TestRecoveryParser fails on windows) TestRecoveryParser fails Key: TEZ-2340 URL: https://issues.apache.org/jira/browse/TEZ-2340 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Stacktrace {code} java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) {code} Standard Output {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. 
2015-04-17 07:23:55,703 INFO [Thread-5] impl.TestDAGImpl (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan 2015-04-17 07:23:55,722 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService 2015-04-17 07:23:55,723 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceStart(127)) - Starting RecoveryService 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(314)) - Error handling summary event, eventType=DAG_SUBMITTED java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt does not start up, 
flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all events 2015-04-17 07:23:55,756 ERROR [RecoveryEventHandlingThread] recovery.RecoveryService (RecoveryService.java:run(146)) - Recovery failure occurred. Stopping recovery thread. Current eventQueueSize=0 {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TEZ-2295) Allow to set vertex level info
[ https://issues.apache.org/jira/browse/TEZ-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang resolved TEZ-2295. - Resolution: Invalid. The history text of the vertex's processor could be used to add vertex-level info. Allow to set vertex level info -- Key: TEZ-2295 URL: https://issues.apache.org/jira/browse/TEZ-2295 Project: Apache Tez Issue Type: Improvement Reporter: Jeff Zhang Assignee: Jeff Zhang Also need to add doc here http://tez.apache.org/tez_ui_user_data.html
[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated TEZ-2348: Attachment: _tez_session_dir.tgz Ran the query through the debugger and found a directory /tmp/hive/user/_tez_session_dir/, would that have the right files? I copied the contents right after hitting the error, attaching here. EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Attachments: _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504507#comment-14504507 ] Jeff Zhang commented on TEZ-2341: - Thanks [~bikassaha]. I checked the log again and it looks like an environment issue: winutils.exe is not installed. {code} 2015-04-17 07:23:38,932 ERROR [IPC Server handler 0 on 55747] util.WindowsBasedProcessTree (WindowsBasedProcessTree.java:getAllProcessInfoFromShell(84)) - ExitCodeException exitCode=2: PrintTaskProcessList error (2): The system cannot find the file specified. TaskExit: error (2): The system cannot find the file specified. at org.apache.hadoop.util.Shell.runCommand(Shell.java:545) at org.apache.hadoop.util.Shell.run(Shell.java:456) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722) at org.apache.hadoop.yarn.util.WindowsBasedProcessTree.getAllProcessInfoFromShell(WindowsBasedProcessTree.java:81) at org.apache.hadoop.yarn.util.WindowsBasedProcessTree.updateProcessTree(WindowsBasedProcessTree.java:125) at org.apache.tez.dag.app.DAGAppMaster.getAMCPUTime(DAGAppMaster.java:347) at org.apache.tez.dag.app.DAGAppMaster.access$2800(DAGAppMaster.java:190) at org.apache.tez.dag.app.DAGAppMaster$RunningAppContext.getCumulativeCPUTime(DAGAppMaster.java:1428) at org.apache.tez.dag.app.dag.impl.DAGImpl.init(DAGImpl.java:527) at org.apache.tez.dag.app.DAGAppMaster.createDAG(DAGAppMaster.java:820) at org.apache.tez.dag.app.DAGAppMaster.createDAG(DAGAppMaster.java:798) at org.apache.tez.dag.app.DAGAppMaster.startDAG(DAGAppMaster.java:2030) at org.apache.tez.dag.app.DAGAppMaster.submitDAGToAppMaster(DAGAppMaster.java:1147) at org.apache.tez.dag.api.client.DAGClientHandler.submitDAG(DAGClientHandler.java:118) at org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.submitDAG(DAGClientAMProtocolBlockingPBServerImpl.java:163) at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:7471) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043) {code} TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
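A quick way to confirm the environment diagnosis above is to check whether winutils.exe is present where Hadoop expects it. The following is a minimal sketch, assuming the usual convention that Hadoop on Windows looks for bin\winutils.exe under the directory given by the hadoop.home.dir system property or the HADOOP_HOME environment variable. The class WinUtilsCheck and its methods are hypothetical names, not Hadoop or Tez API.

```java
import java.io.File;

/** Hypothetical pre-flight check (not Hadoop API): is winutils.exe where Hadoop expects it? */
final class WinUtilsCheck {
    private WinUtilsCheck() {}

    /** Returns the conventional winutils.exe location under a Hadoop home directory. */
    static File expectedWinUtils(String hadoopHome) {
        return new File(new File(hadoopHome, "bin"), "winutils.exe");
    }

    /** True only when a Hadoop home is given and winutils.exe exists under its bin directory. */
    static boolean isInstalled(String hadoopHome) {
        return hadoopHome != null && expectedWinUtils(hadoopHome).isFile();
    }

    public static void main(String[] args) {
        // Assumption: the system property takes precedence over the environment variable.
        String home = System.getProperty("hadoop.home.dir", System.getenv("HADOOP_HOME"));
        if (isInstalled(home)) {
            System.out.println("winutils.exe found at " + expectedWinUtils(home));
        } else {
            System.out.println("winutils.exe not found; Hadoop home is " + home);
        }
    }
}
```

A test that relies on WindowsBasedProcessTree could run such a check up front and report a clear environment error instead of failing later on an assertion.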
[jira] [Commented] (TEZ-1776) TA_CONTAINER_TERMINATING event should not always fail the task attempt
[ https://issues.apache.org/jira/browse/TEZ-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504610#comment-14504610 ] Siddharth Seth commented on TEZ-1776: - There's a separate transition for NODE_FAILURES, which should reach the TaskAttempt before CONTAINER_TERMINATING messages generated by the Container state machine. Those put TaskAttempts into a KILLED state. Are there other scenarios where you're seeing tasks failing when they should be killed? TA_CONTAINER_TERMINATING event should not always fail the task attempt -- Key: TEZ-1776 URL: https://issues.apache.org/jira/browse/TEZ-1776 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Priority: Blocker Fix For: 0.7.0 This is sometimes sent when the node fails or on other non-task-related container failures. For those cases the attempt should transition to killed instead of failed.
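The behaviour requested in this issue can be sketched as a small decision function: the outcome of a task attempt depends on why its container is terminating. This is only an illustration under stated assumptions. The enum values and the class AttemptOutcome are hypothetical and do not mirror the actual Tez state machine, its event types, or its termination causes.

```java
/** Hypothetical illustration (not the Tez state machine): outcome depends on the termination cause. */
final class AttemptOutcome {
    private AttemptOutcome() {}

    /** Assumed, simplified set of reasons a container may be terminating. */
    enum Cause { TASK_ERROR, NODE_FAILED, CONTAINER_PREEMPTED, CONTAINER_LOST }

    /** Terminal states discussed in the issue. */
    enum State { FAILED, KILLED }

    /**
     * Only a failure attributable to the task itself marks the attempt FAILED;
     * node failures and other non-task-related container failures mark it KILLED.
     */
    static State onContainerTerminating(Cause cause) {
        return cause == Cause.TASK_ERROR ? State.FAILED : State.KILLED;
    }

    public static void main(String[] args) {
        for (Cause cause : Cause.values()) {
            System.out.println(cause + " -> " + onContainerTerminating(cause));
        }
    }
}
```

The distinction matters because failed attempts usually count against a task's retry limit, while killed attempts do not, so a node failure should not push a healthy task toward failure.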
[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 = 181
[ https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504694#comment-14504694 ] Hari Sekhon commented on TEZ-2322: -- Hitesh Shah, the yarn logs command failed originally otherwise I would have supplied that output. Jeff Zhang I did note the job did succeed in the end - this is just a jira to mark that the counts were wrong, hence I've labelled this as minor priority to fix. Succeeded count wrong for Pig on Tez job, decreased 380 = 181 -- Key: TEZ-2322 URL: https://issues.apache.org/jira/browse/TEZ-2322 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Minor Attachments: attempt1_syslog_dag_1427546104095_0146_1, attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, attempt2_syslog_dag_1427546104095_0146_1_post During a Pig on Tez job the number of succeeded tasks dropped from 380 = 181 as shown below: {code} 2015-04-15 15:09:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, 
diagnostics= 2015-04-15 15:11:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:56,993 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:12:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: 0 {code} Now this may be because the tasks failed, some certainly did due to space exceptions having checked the logs, but surely once a task has finished successfully and is marked as succeeded it cannot then later be removed from the succeeded count? Perhaps the succeeded counter is incremented too early before the task results are really saved? KilledTaskAttempts jumped from 16 = 89 at the same time, but even this doesn't account for the large drop in number of succeeded tasks. There was also a noticeable jump in Running tasks from 58 = 724 at the same time which is suspicious, I'm pretty sure there was no contending job to finish and release so much more resource to this Tez job, so it's also unclear how the running count could have jumped up so significantly given the cluster hardware resources have been the same throughout. Hari Sekhon http://www.linkedin.com/in/harisekhon
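The drop reported above can be spotted mechanically by scanning the DAG status lines for a Succeeded value that is lower than the one before it. The following is a minimal sketch that assumes the "Succeeded: N" field format shown in the quoted log. The class SucceededCountCheck and its methods are hypothetical names, not part of Pig or Tez.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical log check (not part of Pig or Tez): find where the Succeeded count goes down. */
final class SucceededCountCheck {
    private static final Pattern SUCCEEDED = Pattern.compile("Succeeded: (\\d+)");

    private SucceededCountCheck() {}

    /** Extracts the first "Succeeded: N" value from each line; lines without one are skipped. */
    static List<Integer> succeededCounts(List<String> lines) {
        List<Integer> counts = new ArrayList<>();
        for (String line : lines) {
            Matcher m = SUCCEEDED.matcher(line);
            if (m.find()) {
                counts.add(Integer.parseInt(m.group(1)));
            }
        }
        return counts;
    }

    /** Returns the index of the first count lower than its predecessor, or -1 if none. */
    static int firstDecrease(List<Integer> counts) {
        for (int i = 1; i < counts.size(); i++) {
            if (counts.get(i) < counts.get(i - 1)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Shortened versions of the status lines quoted in the issue.
        List<String> lines = Arrays.asList(
            "progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0",
            "progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0",
            "progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 0 Killed: 0");
        List<Integer> counts = succeededCounts(lines);
        System.out.println("counts=" + counts + ", first decrease at index " + firstDecrease(counts));
    }
}
```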
[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 = 181
[ https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504560#comment-14504560 ] Jeff Zhang commented on TEZ-2322: - [~harisekhon] I checked the log and found that the DAG did eventually succeed. The DAG status reported while recovering may be incorrect; this will confuse users, and we do need to improve it. Succeeded count wrong for Pig on Tez job, decreased 380 = 181 -- Key: TEZ-2322 URL: https://issues.apache.org/jira/browse/TEZ-2322 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Minor Attachments: attempt1_syslog_dag_1427546104095_0146_1, attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, attempt2_syslog_dag_1427546104095_0146_1_post During a Pig on Tez job the number of succeeded tasks dropped from 380 = 181 as shown below: {code} 2015-04-15 15:09:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:16,992 [Timer-0] INFO
org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:56,993 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:12:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: 0 {code} Now this may be because the tasks failed, some certainly did due to space exceptions having checked the logs, but surely once a task has finished successfully and is marked as succeeded it cannot then later be removed from the succeeded count? Perhaps the succeeded counter is incremented too early before the task results are really saved? KilledTaskAttempts jumped from 16 = 89 at the same time, but even this doesn't account for the large drop in number of succeeded tasks. There was also a noticeable jump in Running tasks from 58 = 724 at the same time which is suspicious, I'm pretty sure there was no contending job to finish and release so much more resource to this Tez job, so it's also unclear how the running count could have jumped up so significantly given the cluster hardware resources have been the same throughout. Hari Sekhon http://www.linkedin.com/in/harisekhon
[jira] [Commented] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504837#comment-14504837 ] Prakash Ramachandran commented on TEZ-2345: --- [~sreenathmenon] A few general comments: * When no data is available, the table should not be hidden; it should instead be shown with no rows (e.g. searching for running DAGs when none are running). I also ran into a couple of errors with the patch; it mostly looks like a race condition. * The error below was caused by a store.find('appDetail', appId) call with appId being null. This looks like a serialization issue. bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the store's find method ember.js:3722 Ember.assert ember-data.js:10457 Ember.Object.extend.find combined-scripts.js:5613 getCellContent TEZ-UI: Enable cell level loading in all DAGs table --- Key: TEZ-2345 URL: https://issues.apache.org/jira/browse/TEZ-2345 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2345.1.patch - Enable cell level loading in all DAGs table using basic-ember-table component. - Re-arrange UI elements to make them similar to other tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2344) Tez UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prakash Ramachandran updated TEZ-2344: -- Summary: Tez UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table (was: TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table) Tez UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table Key: TEZ-2344 URL: https://issues.apache.org/jira/browse/TEZ-2344 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2344.1.patch 1. Must handle promises, objects and primitive data types. 2. Must be generic. 3. Display waiting animation or Not Available! messages when required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2345) TEZ-UI: Enable cell level loading in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504837#comment-14504837 ] Prakash Ramachandran edited comment on TEZ-2345 at 4/21/15 11:59 AM: - [~Sreenath] A few general comments: * When no data is available, the table should not be hidden; it should instead be shown with no rows (e.g. searching for running DAGs when none are running). I also ran into a couple of errors with the patch; it mostly looks like a race condition. * The error below was caused by a store.find('appDetail', appId) call with appId being null. This looks like a serialization issue. bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the store's find method ember.js:3722 Ember.assert ember-data.js:10457 Ember.Object.extend.find combined-scripts.js:5613 getCellContent was (Author: pramachandran): [~sreenathmenon] A few general comments: * When no data is available, the table should not be hidden; it should instead be shown with no rows (e.g. searching for running DAGs when none are running). I also ran into a couple of errors with the patch; it mostly looks like a race condition. * The error below was caused by a store.find('appDetail', appId) call with appId being null. This looks like a serialization issue. bq. Uncaught Error: Assertion Failed: You may not pass `undefined` as id to the store's find method ember.js:3722 Ember.assert ember-data.js:10457 Ember.Object.extend.find combined-scripts.js:5613 getCellContent TEZ-UI: Enable cell level loading in all DAGs table --- Key: TEZ-2345 URL: https://issues.apache.org/jira/browse/TEZ-2345 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2345.1.patch - Enable cell level loading in all DAGs table using basic-ember-table component. - Re-arrange UI elements to make them similar to other tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2325) Route status update event directly to the attempt
[ https://issues.apache.org/jira/browse/TEZ-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prakash Ramachandran reassigned TEZ-2325: - Assignee: Prakash Ramachandran Route status update event directly to the attempt -- Key: TEZ-2325 URL: https://issues.apache.org/jira/browse/TEZ-2325 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Assignee: Prakash Ramachandran Today, all events from the attempt heartbeat are routed to the vertex. The vertex then routes status update events (if any) to the attempt. This is unnecessary and potentially creates out-of-order scenarios. We could route the status update events directly to attempts. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
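Note on TEZ-2325: the proposal can be pictured as splitting heartbeat events by type at the point where they arrive, so that status updates skip the extra hop through the vertex. The sketch below is only an illustration under that reading; the types (`HeartbeatRouter`, `Event`, `EventType`) are hypothetical stand-ins and do not correspond to the actual Tez event classes or dispatcher.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model for illustration; not the Tez event classes.
final class HeartbeatRouter {
    enum EventType { STATUS_UPDATE, OTHER }

    static final class Event {
        final EventType type;
        final String attemptId;

        Event(EventType type, String attemptId) {
            this.type = type;
            this.attemptId = attemptId;
        }
    }

    // Stand-ins for the attempt and vertex event queues.
    final List<Event> toAttempt = new ArrayList<>();
    final List<Event> toVertex = new ArrayList<>();

    /** Sends status updates straight to the attempt; everything else still goes to the vertex. */
    void route(List<Event> heartbeatEvents) {
        for (Event e : heartbeatEvents) {
            if (e.type == EventType.STATUS_UPDATE) {
                toAttempt.add(e); // no intermediate hop through the vertex
            } else {
                toVertex.add(e);
            }
        }
    }
}
```

With a single hop, a status update cannot be delayed behind other events queued at the vertex, which is one way the out-of-order scenarios mentioned in the description could be avoided.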
[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2330: Attachment: (was: TEZ-2330.2.patch) Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505708#comment-14505708 ] Hitesh Shah edited comment on TEZ-2341 at 4/21/15 8:40 PM: --- The patch probably needs to change to check windows and not just win. The package dependencies change is non-trivial so better to change the test to ignore the check. was (Author: hitesh): The patch probably needs to change to check windows and not just win. TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505708#comment-14505708 ] Hitesh Shah commented on TEZ-2341: -- The patch probably needs to change to check windows and not just win. TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah reopened TEZ-2341: -- TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505726#comment-14505726 ] Bikas Saha commented on TEZ-2341: - Perhaps just allow for Linux so we don't have to do this for every new OS. TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
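Note on the TEZ-2341 discussion: the comments contrast matching `win`, matching `windows`, and allowing only Linux. The thread does not show the patch itself, so the following is a sketch under assumptions: it takes an OS name string such as the value of the `os.name` system property and compares the three approaches. The class `OsCheck` and its methods are hypothetical. One reason a bare substring test for `win` can misfire is that the string `darwin` also contains it.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not taken from the TEZ-2341 patch.
final class OsCheck {
    /** Loose check: true for any OS name containing "win", which includes "Darwin". */
    static boolean containsWin(String osName) {
        return osName.toLowerCase(Locale.ROOT).contains("win");
    }

    /** Stricter check: true only when the name starts with "windows". */
    static boolean isWindows(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("windows");
    }

    /** Allow-list approach: true only when the name starts with "linux". */
    static boolean isLinux(String osName) {
        return osName.toLowerCase(Locale.ROOT).startsWith("linux");
    }

    public static void main(String[] args) {
        String current = System.getProperty("os.name");
        System.out.println(current + ": windows=" + isWindows(current) + ", linux=" + isLinux(current));
    }
}
```

In a JUnit 4 test, a check like `isLinux(System.getProperty("os.name"))` could be passed to `org.junit.Assume.assumeTrue` so the counter assertions are skipped elsewhere; whether the final patch did this is not stated in these messages.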
[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505835#comment-14505835 ] TezQA commented on TEZ-2330: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726998/TEZ-2330.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/504//console This message is automatically generated. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2330: Attachment: TEZ-2330.2.patch Removing bad diff file. Attaching verified diff file. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class
[ https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505863#comment-14505863 ] Siddharth Seth commented on TEZ-2352: - [~bikassaha], [~rajesh.balamohan] - review please. Move getTaskStatistics into the RuntimeTask class - Key: TEZ-2352 URL: https://issues.apache.org/jira/browse/TEZ-2352 Project: Apache Tez Issue Type: Task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: TEZ-2352.1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows
[ https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505662#comment-14505662 ] Bikas Saha commented on TEZ-2338: - Unfortunately, this is not the correct forum for this question as we are not the experts on Windows Hadoop installation issues. You can send a summary of your problem and your workaround/fix to u...@hadoop.apache.org and some Windows experts there may be able to answer. Since this is not related to Tez, could you please close this jira? Thanks! Tez job failed due to AM Container-Launch failure at windows Key: TEZ-2338 URL: https://issues.apache.org/jira/browse/TEZ-2338 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.0 Environment: Windows server 2012 and Windows-8 Hadoop-2.5.2 Java-1.7 Reporter: Kaveen Raajan I successfully built Tez-0.6.0 against Hadoop-2.5.2. Then I configured Tez-0.6.0 as described in http://tez.apache.org/install.html. I moved the Tez lib package to an HDFS location and updated my tez-site.xml: {code:xml} <property> <name>tez.lib.uris</name> <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value> </property> {code} After that I tried the sample test for Tez: _hadoop jar tez-examples-0.6.0.jar orderedwordcount input output_ But I got the following error while running this command. *Note:* I'm using a Hadoop High Availability setup. {code} Running OrderedWordCount SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/C:/Hadoop/ share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBind er.class] SLF4J: Found binding in [jar:file:/C:/Tez/lib /slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api , version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apa che.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ] 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: app lication_1429073725727_0005 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from conf iguration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/ 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/sta ging doesn't exist and is created 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster /tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't ex ist and is created 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=a pplication_1429073725727_0005, dagName=OrderedWordCount 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_14 29073725727_0005 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/ 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. 
FinalState=FAILED OrderedWordCount failed with diagnostics: [Application application_1429073725727 _0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_0 2 exited with exitCode: -1073741515 due to: Exception from container-launch: Ex itCodeException exitCode=-1073741515: ExitCodeException exitCode=-1073741515: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java: 702) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.la unchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C ontainerLaunch.call(ContainerLaunch.java:300) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C ontainerLaunch.call(ContainerLaunch.java:81) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor. java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:615) at java.lang.Thread.run(Thread.java:744) 1 file(s) moved. Container exited with a non-zero exit code -1073741515 .Failing this attempt.. Failing the application.] {code} While Seeing at Resourcemanager log: {code} 2015-04-19 21:49:57,533 INFO
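Note on the exit code in the TEZ-2338 report: Windows process exit codes such as -1073741515 are easier to look up in hexadecimal. The sketch below only shows the conversion; the class name is hypothetical. The reading of the resulting value as the NTSTATUS code STATUS_DLL_NOT_FOUND is an interpretation added here and is not stated anywhere in the thread.

```java
// Hypothetical helper for illustration; the name is not from Hadoop or Tez.
final class ExitCodeHex {
    /** Formats a signed 32-bit exit code as its unsigned hexadecimal form. */
    static String toHex(int exitCode) {
        return String.format("0x%08X", exitCode);
    }

    public static void main(String[] args) {
        System.out.println(toHex(-1073741515)); // prints 0xC0000135
    }
}
```

If that interpretation holds, it would be consistent with the comment that the failure is a Windows Hadoop installation issue rather than a Tez problem.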
[jira] [Commented] (TEZ-2341) TestMockDAGAppMaster.testBasicCounters fails on windows
[ https://issues.apache.org/jira/browse/TEZ-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505655#comment-14505655 ] Bikas Saha commented on TEZ-2341: - Rethinking, it's probably OK to ignore this in the test since we are running this in a simulation as part of the unit test, and that will not have winutils deployed. So we cannot use winutils. Alternatively, we could configure TezMxBeanResourceCalculator as the resource calculator plugin class, but the project dependencies would probably need to be tweaked for that. TestMockDAGAppMaster.testBasicCounters fails on windows --- Key: TEZ-2341 URL: https://issues.apache.org/jira/browse/TEZ-2341 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Priority: Minor Attachments: TEZ-2341-1.patch {code} java.lang.AssertionError: null at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.tez.dag.app.TestMockDAGAppMaster.testBasicCounters(TestMockDAGAppMaster.java:323) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2330: Attachment: TEZ-2330.2.patch Per discussion in TEZ-2292 removing TezException from the signature. Thanks for the review. Will wait for another clean Jenkins run before committing. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2330 PreCommit Build #504
Jira: https://issues.apache.org/jira/browse/TEZ-2330 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/504/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 90 lines...] == == Determining number of patched javac warnings. == == /home/jenkins/tools/maven/latest/bin/mvn clean test -DskipTests -Ptest-patch /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/patchJavacWarnings.txt 21 {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726998/TEZ-2330.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color:red}. The patch appears to cause the build to fail. Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/504//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 64ac4c08483002d810ed05b12d11d82f6c3d1def logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #501 Archived 3 artifacts Archive block size is 32768 Received 0 blocks and 791005 bytes Compression is 0.0% Took 0.59 sec [description-setter] Could not determine description. Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Created] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class
Siddharth Seth created TEZ-2352: --- Summary: Move getTaskStatistics into the RuntimeTask class Key: TEZ-2352 URL: https://issues.apache.org/jira/browse/TEZ-2352 Project: Apache Tez Issue Type: Task Reporter: Siddharth Seth Assignee: Siddharth Seth -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-1776) TA_CONTAINER_TERMINATING event should not always fail the task attempt
[ https://issues.apache.org/jira/browse/TEZ-1776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505680#comment-14505680 ] Bikas Saha commented on TEZ-1776: - IMO a test case will show what the sequence of events really is. If there is an issue, we need a fix; if there isn't one, then we need to fix the current transition logic for these events. TA_CONTAINER_TERMINATING event should not always fail the task attempt -- Key: TEZ-1776 URL: https://issues.apache.org/jira/browse/TEZ-1776 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Priority: Blocker Fix For: 0.7.0 This is sometimes sent when the node fails or for other non-task-related container failures. For those cases the attempt should transition to killed instead of failed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
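Note on TEZ-1776: the requested behavior amounts to choosing the attempt's terminal state from the reason the container is terminating, instead of always failing the attempt. A minimal sketch of that decision follows, assuming the termination cause is available when the event is handled. The enum values and the class `TerminationOutcome` are hypothetical and do not reflect the actual Tez state machine or its event payloads.

```java
// Hypothetical, simplified model for illustration; not the Tez task attempt state machine.
final class TerminationOutcome {
    /** Assumed causes; the real event may carry different or more detailed information. */
    enum Cause { TASK_ERROR, NODE_FAILED, OTHER_NON_TASK_FAILURE }

    enum AttemptState { FAILED, KILLED }

    /** Only a task-related cause counts against the attempt; other causes lead to KILLED. */
    static AttemptState onContainerTerminating(Cause cause) {
        return cause == Cause.TASK_ERROR ? AttemptState.FAILED : AttemptState.KILLED;
    }
}
```

A test case of the kind suggested in the comment could feed each cause through the real transition and compare the resulting state with a table like this one.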
Success: TEZ-2292 PreCommit Build #503
Jira: https://issues.apache.org/jira/browse/TEZ-2292 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/503/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 2774 lines...] [INFO] Final Memory: 70M/945M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726994/TEZ-2292.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/503//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/503//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. acef431a5ac778003f2d56428bf296a71b911af9 logged out == == Finished build. == == Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #501 Archived 44 artifacts Archive block size is 32768 Received 4 blocks and 2624892 bytes Compression is 4.8% Took 0.64 sec Description set: TEZ-2292 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs
[ https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505900#comment-14505900 ] TezQA commented on TEZ-2292: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726994/TEZ-2292.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 3 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/503//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/503//console This message is automatically generated. Add e2e test for error reporting when vertex manager invokes plugin APIs Key: TEZ-2292 URL: https://issues.apache.org/jira/browse/TEZ-2292 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.7.0 Attachments: TEZ-2292.1.patch, TEZ-2292.2.patch If the Vertex Manager has an error or cannot apply a required reconfiguration then it should be allowed to fail the vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2330 PreCommit Build #505
Jira: https://issues.apache.org/jira/browse/TEZ-2330 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/505/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 2772 lines...] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727003/TEZ-2330.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 175 javac compiler warnings (more than the master's current 174 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/505//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/505//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. b2149840de3a95d0490ab745f0e9a98e621eada0 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #503 Archived 45 artifacts Archive block size is 32768 Received 4 blocks and 2633450 bytes Compression is 4.7% Took 1.4 sec [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class
[ https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505948#comment-14505948 ] TezQA commented on TEZ-2352: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727004/TEZ-2352.1.txt against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/506//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/506//console This message is automatically generated. Move getTaskStatistics into the RuntimeTask class - Key: TEZ-2352 URL: https://issues.apache.org/jira/browse/TEZ-2352 Project: Apache Tez Issue Type: Task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: TEZ-2352.1.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2352 PreCommit Build #507
Jira: https://issues.apache.org/jira/browse/TEZ-2352 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/507/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 1232 lines...] Running tests /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -DTezPatchProcess cat: /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt: No such file or directory awk: cannot open /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/../patchprocess/testrun.txt (No such file or directory) {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727060/TEZ-2352.2.txt against master revision c6e400e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:red}-1 tests included{color}. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/507//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/507//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 04e6074a66f66be609643b903ad10ee55bfd4f24 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts [description-setter] Could not determine description. 
Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2327) NPE in shuffle
[ https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506326#comment-14506326 ] Hitesh Shah commented on TEZ-2327: -- What branch is this against? NPE in shuffle -- Key: TEZ-2327 URL: https://issues.apache.org/jira/browse/TEZ-2327 Project: Apache Tez Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Siddharth Seth {noformat} 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while running task:java.lang.NullPointerException at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93) at java.io.FilterInputStream.close(FilterInputStream.java:181) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395) at java.io.BufferedInputStream.close(BufferedInputStream.java:483) at java.io.FilterInputStream.close(FilterInputStream.java:181) at org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629) at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759) at org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182) at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This caused the task in question to fail -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506252#comment-14506252 ] TezQA commented on TEZ-2330: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727080/TEZ-2330.3.patch against master revision c6e400e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 160 javac compiler warnings (more than the master's current 159 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/508//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/508//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/508//console This message is automatically generated. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
Failed: TEZ-2330 PreCommit Build #508
Jira: https://issues.apache.org/jira/browse/TEZ-2330 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/508/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 2774 lines...] {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727080/TEZ-2330.3.patch against master revision c6e400e. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 160 javac compiler warnings (more than the master's current 159 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/508//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/508//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/508//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 251ae3b565fdb2ef353d9ad85cf61ac9d982b847 logged out == == Finished build. == == Build step 'Execute shell' marked build as failure Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #503 Archived 45 artifacts Archive block size is 32768 Received 4 blocks and 2618264 bytes Compression is 4.8% Took 0.7 sec [description-setter] Could not determine description. Recording test results Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506261#comment-14506261 ] Bikas Saha commented on TEZ-2330: - The javac warning seems to be unrelated to this patch as it does not touch getTaskContainer 77a78 [WARNING] /home/jenkins/jenkins-slave/workspace/PreCommit-TEZ-Build/tez-dag/src/main/java/org/apache/tez/dag/app/dag/impl/VertexManager.java:[298,34] [deprecation] getTaskContainer(String,Integer) in VertexManagerPluginContext has been deprecated Committing in a bit. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows
[ https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kaveen Raajan resolved TEZ-2338. Resolution: Done Fix we tried: We downloaded a DLL fixer tool and reinstalled the MSVCR100.dll file on the NM machine. After that, we ran a MapReduce program as a Tez job; it was submitted and completed successfully, and no issue occurred. Tez job failed due to AM Container-Launch failure at windows Key: TEZ-2338 URL: https://issues.apache.org/jira/browse/TEZ-2338 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.0 Environment: Windows server 2012 and Windows-8 Hadoop-2.5.2 Java-1.7 Reporter: Kaveen Raajan I successfully built Tez-0.6.0 against Hadoop-2.5.2. Then I configured Tez-0.6.0 as described at http://tez.apache.org/install.html. I moved the Tez lib package to an HDFS location and updated my tez-site.xml: {code:xml} <property> <name>tez.lib.uris</name> <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value> </property> {code} After that I tried the sample test for Tez: _hadoop jar tez-examples-0.6.0.jar orderedwordcount input output_ But I got the following error while running this command. *Note:* I'm using a Hadoop High Availability setup. {code} Running OrderedWordCount SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/C:/Hadoop/ share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBind er.class] SLF4J: Found binding in [jar:file:/C:/Tez/lib /slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. 
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api , version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apa che.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ] 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: app lication_1429073725727_0005 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from conf iguration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/ 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/sta ging doesn't exist and is created 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster /tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't ex ist and is created 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=a pplication_1429073725727_0005, dagName=OrderedWordCount 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_14 29073725727_0005 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/ 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. 
FinalState=FAILED OrderedWordCount failed with diagnostics: [Application application_1429073725727 _0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_0 2 exited with exitCode: -1073741515 due to: Exception from container-launch: Ex itCodeException exitCode=-1073741515: ExitCodeException exitCode=-1073741515: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java: 702) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.la unchContainer(DefaultContainerExecutor.java:195) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C ontainerLaunch.call(ContainerLaunch.java:300) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.C ontainerLaunch.call(ContainerLaunch.java:81) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor. java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor .java:615) at java.lang.Thread.run(Thread.java:744) 1 file(s) moved. Container exited with a non-zero exit code -1073741515 .Failing this attempt.. Failing the application.] {code} While Seeing at Resourcemanager log: {code} 2015-04-19 21:49:57,533 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: completedContainer container=Container: [ContainerId: container_1429505171727_0001_02_01,
[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506418#comment-14506418 ] Gopal V commented on TEZ-2348: -- [~rajesh.balamohan]: this needs a different exception with better messaging. With this particular fix, someone writing incorrect code during development would end up with an infinite loop {code} while (true) { reader.next(); ... if (key check) { break; } } {code} Some sort of exception on bad usage is far easier to debug than an infinite loop condition. EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. 
Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506419#comment-14506419 ] Rajesh Balamohan commented on TEZ-2348: --- Sure. Will add the checks in next() method (basically need to throw the exception when reader's next() is called after it returns false). Currently it is throwing from IFile which is somewhat misleading. EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 
15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
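A minimal sketch of the guard discussed in the two comments above: once next() has returned false, any further call fails fast with a message that names the misuse, instead of surfacing as an EOF error from a lower layer or looping forever. GuardedReader is a hypothetical stand-in written for this illustration, not the Tez UnorderedKVReader, and the choice of IllegalStateException is an assumption; the thread does not say which exception type the patch uses.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a key-value reader; not the Tez UnorderedKVReader.
class GuardedReader {
    private final Iterator<String> records;
    private String current;
    private boolean exhausted = false; // set once next() has returned false

    GuardedReader(List<String> input) {
        this.records = input.iterator();
    }

    /** Advances to the next record; returns false once, at end of input. */
    boolean next() {
        if (exhausted) {
            // Assumed exception type: fail fast and point at the caller's misuse.
            throw new IllegalStateException(
                "next() called after it already returned false; no more records");
        }
        if (records.hasNext()) {
            current = records.next();
            return true;
        }
        exhausted = true;
        current = null;
        return false;
    }

    String getCurrent() {
        return current;
    }
}

class GuardedReaderDemo {
    public static void main(String[] args) {
        GuardedReader reader = new GuardedReader(Arrays.asList("a", "b"));
        // Correct usage: the loop condition is the return value of next().
        while (reader.next()) {
            System.out.println(reader.getCurrent());
        }
        try {
            reader.next(); // misuse: one call too many
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

With a guard like this, a consumer written as while (true) { reader.next(); ... } stops with an exception on the first call past the end rather than spinning.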
[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506426#comment-14506426 ] TezQA commented on TEZ-2348: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727113/TEZ-2348.1.patch against master revision ec45c51. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/509//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/509//console This message is automatically generated. EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. 
Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2356) TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api
Thejas M Nair created TEZ-2356: -- Summary: TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api Key: TEZ-2356 URL: https://issues.apache.org/jira/browse/TEZ-2356 Project: Apache Tez Issue Type: Bug Affects Versions: 0.7.0 Reporter: Thejas M Nair Priority: Blocker This breaks pig compilation and needs urgent attention. {code} src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java:173: error: exception TezException is never thrown in body of corresponding try statement [javac] } catch (TezException e) { [javac] ^ {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
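For readers unfamiliar with this javac error, here is a small sketch of how it can arise. Java rejects a catch clause for a checked exception that the try body cannot throw, so dropping a throws clause from a method is a source-incompatible change for callers that catch that exception. The error text suggests something like this happened to a method the Pig code calls, but the report does not show the signatures, so every class and method name below is hypothetical and DemoCheckedException merely stands in for TezException.

```java
// Hypothetical checked exception standing in for TezException.
class DemoCheckedException extends Exception {
    DemoCheckedException(String message) {
        super(message);
    }
}

// Hypothetical "before" shape: the method declares a checked exception.
class VertexContextV1 {
    void reconfigure(int parallelism) throws DemoCheckedException {
        if (parallelism <= 0) {
            throw new DemoCheckedException("parallelism must be positive");
        }
    }
}

// Hypothetical "after" shape: the throws clause is gone.
class VertexContextV2 {
    void reconfigure(int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
    }
}

class CallerDemo {
    // Compiles against V1 because the try body can throw the checked exception.
    static String callV1(VertexContextV1 ctx, int parallelism) {
        try {
            ctx.reconfigure(parallelism);
            return "ok";
        } catch (DemoCheckedException e) {
            return "failed: " + e.getMessage();
        }
    }

    // Against V2, "catch (DemoCheckedException e)" here would be rejected with
    // "exception DemoCheckedException is never thrown in body of corresponding
    // try statement". Catching Exception is always permitted.
    static String callV2(VertexContextV2 ctx, int parallelism) {
        try {
            ctx.reconfigure(parallelism);
            return "ok";
        } catch (Exception e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(callV1(new VertexContextV1(), 0));
        System.out.println(callV2(new VertexContextV2(), 0));
    }
}
```

A caller that has to build against both shapes can catch a broader type, as callV2 does, at the cost of also catching unrelated runtime exceptions.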
Success: TEZ-2348 PreCommit Build #509
Jira: https://issues.apache.org/jira/browse/TEZ-2348 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/509/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 2770 lines...] [INFO] Final Memory: 73M/982M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727113/TEZ-2348.1.patch against master revision ec45c51. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/509//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/509//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 2a11b9a6392368667d5c747d76a531f63089e691 logged out == == Finished build. == == Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #503 Archived 44 artifacts Archive block size is 32768 Received 2 blocks and 2679005 bytes Compression is 2.4% Took 1.3 sec Description set: TEZ-2348 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-2327) NPE in shuffle
[ https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hitesh Shah updated TEZ-2327: - Affects Version/s: TEZ-2003 NPE in shuffle -- Key: TEZ-2327 URL: https://issues.apache.org/jira/browse/TEZ-2327 Project: Apache Tez Issue Type: Bug Affects Versions: TEZ-2003 Reporter: Sergey Shelukhin Assignee: Siddharth Seth {noformat} 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while running task:java.lang.NullPointerException at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93) at java.io.FilterInputStream.close(FilterInputStream.java:181) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395) at java.io.BufferedInputStream.close(BufferedInputStream.java:483) at java.io.FilterInputStream.close(FilterInputStream.java:181) at org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629) at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759) at org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182) at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This caused the task in question to fail -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2248) VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks
[ https://issues.apache.org/jira/browse/TEZ-2248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jeff Zhang updated TEZ-2248: Attachment: TEZ-2248-1.patch Attach the patch. [~bikassaha] Please help review it. VertexImpl/DAGImpl.checkForCompletion have too many termination cause checks Key: TEZ-2248 URL: https://issues.apache.org/jira/browse/TEZ-2248 Project: Apache Tez Issue Type: Bug Reporter: Bikas Saha Attachments: TEZ-2248-1.patch There is an if check for each termination cause which makes code long and we need to handle each new termination cause with more code. This could be abstracted into a method that gets termination cause string based on the enum and make this method shorter and stable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
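The abstraction proposed in the description could look roughly like the sketch below: each enum constant carries its own diagnostic text, so the completion check needs one lookup instead of one if branch per cause. The enum name, its constants, and the message wording are all hypothetical and are not taken from the Tez code base or from the attached patch.

```java
// Hypothetical enum; constant names and messages are illustrative only.
enum TerminationCauseSketch {
    USER_KILL("killed on user request"),
    TASK_FAILURE("one or more tasks failed"),
    INTERNAL_ERROR("internal error");

    private final String diagnostic;

    TerminationCauseSketch(String diagnostic) {
        this.diagnostic = diagnostic;
    }

    // Builds the diagnostic string for the entity (vertex or DAG) that terminated.
    String describe(String entityName) {
        return entityName + " terminated: " + diagnostic + " (" + name() + ")";
    }
}

class CompletionCheckSketch {
    // A single lookup replaces a chain of per-cause if checks.
    static String diagnosticsFor(String entityName, TerminationCauseSketch cause) {
        if (cause == null) {
            return entityName + " completed without a termination cause";
        }
        return cause.describe(entityName);
    }

    public static void main(String[] args) {
        System.out.println(diagnosticsFor("vertex_1", TerminationCauseSketch.TASK_FAILURE));
        System.out.println(diagnosticsFor("dag_1", null));
    }
}
```

Adding a new termination cause then means adding one enum constant with its message; the check itself stays unchanged.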
[jira] [Commented] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506407#comment-14506407 ] Hitesh Shah commented on TEZ-2348: -- [~rajesh.balamohan] Does the same issue hold for the other readers? EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2356) TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api
[ https://issues.apache.org/jira/browse/TEZ-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506455#comment-14506455 ] Hitesh Shah commented on TEZ-2356: -- [~thejas] Is pig compiling against 0.7.0-SNAPSHOT? The reconfigureVertex api was introduced in master just recently. \cc [~daijy] and [~bikassaha] who have been working on something related to this api to address autoparallelism issues for pig. TEZ-2292 breaks VertexManagerPluginContext.reconfigureVertex api Key: TEZ-2356 URL: https://issues.apache.org/jira/browse/TEZ-2356 Project: Apache Tez Issue Type: Bug Affects Versions: 0.7.0 Reporter: Thejas M Nair Priority: Blocker This breaks pig compilation and needs urgent attention. {code} src/org/apache/pig/backend/hadoop/executionengine/tez/runtime/PigGraceShuffleVertexManager.java:173: error: exception TezException is never thrown in body of corresponding try statement [javac] } catch (TezException e) { [javac] ^ {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-2348: -- Attachment: TEZ-2348.1.patch [~jdere] - The exception from IFile is valid, as the higher-level API is not expected to call nextRawKey() once the end of file is reached. Can you please check whether your Hive patch is calling UnorderedKVReader.next() even after it returns false? In such situations, this error is possible. For instance, you can check for "readers.UnorderedKVReader: Num Records read:" in the task attempt log. This indicates that the UnorderedKVReader has finished processing and no more data is available. However, if higher-level APIs invoke UnorderedKVReader.next() again, it would end up throwing an EOF exception from IFile. Attaching the patch, which handles this situation on the Tez side. [~sseth] - Can you please review? EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. 
Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan reassigned TEZ-2348: - Assignee: Rajesh Balamohan EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2327) NPE in shuffle
[ https://issues.apache.org/jira/browse/TEZ-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506448#comment-14506448 ] Sergey Shelukhin commented on TEZ-2327: --- LLAP NPE in shuffle -- Key: TEZ-2327 URL: https://issues.apache.org/jira/browse/TEZ-2327 Project: Apache Tez Issue Type: Bug Reporter: Sergey Shelukhin Assignee: Siddharth Seth {noformat} 2015-04-15 15:19:46,529 INFO [Dispatcher thread: Central] history.HistoryEventHandler: [HISTORY][DAG:dag_1428572510173_0219_1][Event:TASK_ATTEMPT_FINISHED]: vertexName=Reducer 2, taskAttemptId=attempt_1428572510173_0219_1_08_000872_0, startTime=1429136298733, finishTime=1429136386528, timeTaken=87795, status=FAILED, errorEnum=FRAMEWORK_ERROR, diagnostics=Error: Failure while running task:java.lang.NullPointerException at sun.net.www.http.KeepAliveStream.close(KeepAliveStream.java:93) at java.io.FilterInputStream.close(FilterInputStream.java:181) at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.close(HttpURLConnection.java:3395) at java.io.BufferedInputStream.close(BufferedInputStream.java:483) at java.io.FilterInputStream.close(FilterInputStream.java:181) at org.apache.tez.runtime.library.common.shuffle.HttpConnection.cleanup(HttpConnection.java:278) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:644) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdownInternal(Fetcher.java:634) at org.apache.tez.runtime.library.common.shuffle.Fetcher.shutdown(Fetcher.java:629) at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleManager.shutdown(ShuffleManager.java:759) at org.apache.tez.runtime.library.input.UnorderedKVInput.close(UnorderedKVInput.java:209) at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.close(LogicalIOProcessorRuntimeTask.java:347) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:182) at 
org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:172) at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:168) at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} This caused the task in question to fail -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (TEZ-2348) EOF exception during UnorderedKVReader.next()
[ https://issues.apache.org/jira/browse/TEZ-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rajesh Balamohan updated TEZ-2348: -- Attachment: TEZ-2348.2.patch Attaching the patch, which would throw IOException when reader.next() is called again after it has returned false. Question: However, other readers (e.g. MRReaderMapReduce) return false as well when multiple invocations are made. So theoretically, they can also get into a tight loop with the example code you posted. Since the example usage is given in KeyValueReader/KeyValuesReader, is it safe to assume that people would check the return value and not write infinite loop code? [~hitesh] - Other readers return false when multiple invocations are made. However, in the case of UnorderedKVReader, it was getting into the IFile path due to a stale reference in currentReader. The first patch removed the stale reference and was returning false (like other readers). EOF exception during UnorderedKVReader.next() - Key: TEZ-2348 URL: https://issues.apache.org/jira/browse/TEZ-2348 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Reporter: Jason Dere Assignee: Rajesh Balamohan Attachments: TEZ-2348.1.patch, TEZ-2348.2.patch, _tez_session_dir.tgz {noformat} Caused by: java.lang.RuntimeException: java.io.IOException: Reached EOF. Completed reading 516605 at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:278) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordProcessor.run(ReduceRecordProcessor.java:184) at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) ... 13 more Caused by: java.io.IOException: Reached EOF. 
Completed reading 516605 at org.apache.tez.runtime.library.common.sort.impl.IFile.checkState(IFile.java:817) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.positionToNextRecord(IFile.java:698) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readRawKey(IFile.java:731) at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.nextRawKey(IFile.java:727) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.readNextFromCurrentReader(UnorderedKVReader.java:151) at org.apache.tez.runtime.library.common.readers.UnorderedKVReader.next(UnorderedKVReader.java:112) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource$KeyValuesFromKeyValue.next(ReduceRecordSource.java:439) at org.apache.hadoop.hive.ql.exec.tez.ReduceRecordSource.pushRecord(ReduceRecordSource.java:232) ... 15 more {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
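The behaviour described for the second patch (fail fast when next() is invoked again after it has returned false) might look roughly like the guard below. This is a sketch under assumptions, not the contents of TEZ-2348.2.patch: GuardedReaderSketch, RecordSource and the completed flag are hypothetical names chosen for illustration.

```java
import java.io.IOException;

class GuardedReaderSketch {

    /** Hypothetical source of records; advance() returns false once the data is exhausted. */
    interface RecordSource {
        boolean advance() throws IOException;
    }

    private final RecordSource source;
    private boolean completed = false; // set once the source reports no more data

    GuardedReaderSketch(RecordSource source) {
        this.source = source;
    }

    /** Returns true while records remain; throws if called again after returning false. */
    boolean next() throws IOException {
        if (completed) {
            throw new IOException("next() invoked after the reader reported completion");
        }
        boolean hasMore = source.advance();
        if (!hasMore) {
            completed = true; // never touch the underlying source again
        }
        return hasMore;
    }

    /** Small self-check: true when the second call after 'false' is rejected. */
    static boolean secondCallThrows() {
        GuardedReaderSketch reader = new GuardedReaderSketch(() -> false);
        try {
            reader.next(); // first call: returns false
            reader.next(); // second call: expected to throw
            return false;
        } catch (IOException expected) {
            return true;
        }
    }
}
```

The alternative mentioned in the comment, returning false on every later call, would replace the throw with `return false`; which of the two is preferable is exactly the question raised above.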
[jira] [Updated] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2330: Attachment: TEZ-2330.3.patch Rebasing after recent commits. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch, TEZ-2330.3.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (TEZ-2355) TezRuntimeConfiguration inconsistencies in field names
Hitesh Shah created TEZ-2355: Summary: TezRuntimeConfiguration inconsistencies in field names Key: TEZ-2355 URL: https://issues.apache.org/jira/browse/TEZ-2355 Project: Apache Tez Issue Type: Bug Reporter: Hitesh Shah Assignee: Rajesh Balamohan TEZ_RUNTIME_INPUT_BUFFER_PERCENT_DEFAULT compared to TEZ_RUNTIME_INPUT_POST_MERGE_BUFFER_PERCENT TEZ_RUNTIME_SHUFFLE_STALLED_COPY_TIMEOUT_DEFAULT compared to TEZ_RUNTIME_SHUFFLE_CONNECT_TIMEOUT Given that this is a public api, we will need to deprecate the inconsistent names and not remove them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
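Deprecating a public constant without removing it, as the description asks, is commonly done by keeping the old name as an alias of the new one. The sketch below uses invented names (EXAMPLE_CONNECT_TIMEOUT and friends) and an invented default value; it does not reflect the actual fields or values in TezRuntimeConfiguration.

```java
class RuntimeConfigSketch {

    /** Hypothetical property key, for illustration only. */
    static final String EXAMPLE_CONNECT_TIMEOUT = "example.runtime.connect.timeout";

    /** Default whose name follows the KEY + "_DEFAULT" convention (value is made up). */
    static final int EXAMPLE_CONNECT_TIMEOUT_DEFAULT = 30_000;

    /**
     * Old, inconsistently named default kept for source compatibility.
     *
     * @deprecated use {@link #EXAMPLE_CONNECT_TIMEOUT_DEFAULT} instead.
     */
    @Deprecated
    static final int EXAMPLE_STALLED_TIMEOUT_DEFAULT = EXAMPLE_CONNECT_TIMEOUT_DEFAULT;
}
```

Existing callers keep compiling (with a deprecation warning), and both names are guaranteed to carry the same value because one is defined in terms of the other.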
[jira] [Commented] (TEZ-2330) Create reconfigureVertex() API for input based initialization
[ https://issues.apache.org/jira/browse/TEZ-2330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14505935#comment-14505935 ] TezQA commented on TEZ-2330: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12727003/TEZ-2330.2.patch against master revision f46997a. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 175 javac compiler warnings (more than the master's current 174 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/505//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html Javac warnings: https://builds.apache.org/job/PreCommit-TEZ-Build/505//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/505//console This message is automatically generated. Create reconfigureVertex() API for input based initialization -- Key: TEZ-2330 URL: https://issues.apache.org/jira/browse/TEZ-2330 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2330.1.patch, TEZ-2330.2.patch TEZ-2233 added a reconfigureVertex() to enable a cleaner API to change parallelism of a vertex. 
Adding a variant to do the same for input initialization based parallelism change would allow us to deprecate the older overloaded setParallelism() API. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2352) Move getTaskStatistics into the RuntimeTask class
[ https://issues.apache.org/jira/browse/TEZ-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14506177#comment-14506177 ] Bikas Saha commented on TEZ-2352: - lgtm. there may be an unused import for taskstatistics in lioruntimetask. Move getTaskStatistics into the RuntimeTask class - Key: TEZ-2352 URL: https://issues.apache.org/jira/browse/TEZ-2352 Project: Apache Tez Issue Type: Task Reporter: Siddharth Seth Assignee: Siddharth Seth Attachments: TEZ-2352.1.txt, TEZ-2352.2.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2308) Add set/get of record counts in task/vertex statistics
[ https://issues.apache.org/jira/browse/TEZ-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504741#comment-14504741 ] Rajesh Balamohan commented on TEZ-2308: --- lgtm. +1. Add set/get of record counts in task/vertex statistics -- Key: TEZ-2308 URL: https://issues.apache.org/jira/browse/TEZ-2308 Project: Apache Tez Issue Type: Sub-task Reporter: Bikas Saha Assignee: Bikas Saha Attachments: TEZ-2308.1.patch In addition to data size, getting record count would be useful. /cc [~rohini] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2338) Tez job failed due to AM Container-Launch failure at windows
[ https://issues.apache.org/jira/browse/TEZ-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504776#comment-14504776 ] Kaveen Raajan commented on TEZ-2338: Hi [~hitesh] Thanks for the update :), we tried adding this to our yarn-site.xml. {code:xml} <property> <name>yarn.nodemanager.delete.debug-delay-sec</name> <value>1200</value> </property> {code} We noticed one thing while running the *launch-container.cmd* located in the hadoop _\tmp\..\appcache_ location. It raises an issue accessing a *.dll* needed for running mapreduce on the windows platform, i.e. an MSVCR100.dll message box was shown while handling the TEZ job. *Error Message:* {quote}The program can't start because MSVCR100.dll is missing from your computer. Try reinstalling the program to fix this issue{quote} But we installed framework-4.5 on that NM node, and we also find MSVCR100.dll at the C:\Windows\System32\ location. Even so, we face the same issue. *Fix we tried:* We then downloaded the dll-file fixer [download|http://download.dll-files.com/fixer/filest/dff_fdp2-msvcr100.exe] and reinstalled the MSVCR100.dll file on the NM machine. After that we tried the mapreduce program again; the TEZ job got submitted and completed successfully, and no issue occurred. Is this a proper fix for the above exception, and what is the reason for this exception? 
Tez job failed due to AM Container-Launch failure at windows Key: TEZ-2338 URL: https://issues.apache.org/jira/browse/TEZ-2338 Project: Apache Tez Issue Type: Bug Affects Versions: 0.6.0 Environment: Windows server 2012 and Windows-8 Hadoop-2.5.2 Java-1.7 Reporter: Kaveen Raajan I successfully Build Tez-0.6.0 against Hadoop-2.5.2 Then I configured Tez-0.6.0 as like in http://tez.apache.org/install.html Moved Tez lib package to HDFS location and updated my tez-site.xml {code:xml} <property> <name>tez.lib.uris</name> <value>${fs.default.name}/apps/Tez/,${fs.default.name}/apps/Tez/lib/</value> </property> {code} After that I tried the sample test for tez _hadoop jar tez-examples-0.6.0.jar orderedwordcount input output_ But I face following error while running this command *Note:* I'm using HADOOP High Availability setup. {code} Running OrderedWordCount SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/C:/Hadoop/ share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/C:/Tez/lib /slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory] 15/04/15 10:47:57 INFO client.TezClient: Tez Client Version: [ component=tez-api , version=0.6.0, revision=${buildNumber}, SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, buildTime=2015-04-15T01:13:02Z ] 15/04/15 10:48:00 INFO client.TezClient: Submitting DAG application with id: application_1429073725727_0005 15/04/15 10:48:00 INFO Configuration.deprecation: fs.default.name is deprecated. 
Instead, use fs.defaultFS 15/04/15 10:48:00 INFO client.TezClientUtils: Using tez.lib.uris value from configuration: hdfs://HACluster/apps/Tez/,hdfs://HACluster/apps/Tez/lib/ 15/04/15 10:48:01 INFO client.TezClient: Stage directory /tmp/app/tez/staging doesn't exist and is created 15/04/15 10:48:01 INFO client.TezClient: Tez system stage directory hdfs://HACluster /tmp/app/tez/staging/.tez/application_1429073725727_0005 doesn't exist and is created 15/04/15 10:48:02 INFO client.TezClient: Submitting DAG to YARN, applicationId=application_1429073725727_0005, dagName=OrderedWordCount 15/04/15 10:48:03 INFO impl.YarnClientImpl: Submitted application application_1429073725727_0005 15/04/15 10:48:03 INFO client.TezClient: The url to track the Tez AM: http://MASTER_NN1:8088/proxy/application_1429073725727_0005/ 15/04/15 10:48:03 INFO client.DAGClientImpl: Waiting for DAG to start running 15/04/15 10:48:09 INFO client.DAGClientImpl: DAG completed. FinalState=FAILED OrderedWordCount failed with diagnostics: [Application application_1429073725727_0005 failed 2 times due to AM Container for appattempt_1429073725727_0005_0 2 exited with exitCode: -1073741515 due to: Exception from container-launch: ExitCodeException exitCode=-1073741515: ExitCodeException exitCode=-1073741515: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
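A note on the exit code in the log above: Windows process exit codes are often NTSTATUS values printed as signed 32-bit integers, so it can help to view them in hexadecimal. The helper below is a small sketch (ExitCodeSketch and toNtStatusHex are names invented for this example). For -1073741515 it yields 0xC0000135, which to the best of my knowledge is STATUS_DLL_NOT_FOUND, and that would be consistent with the missing MSVCR100.dll reported in the comment on this issue.

```java
class ExitCodeSketch {

    /** Formats a signed 32-bit exit code as an 8-digit unsigned hex string. */
    static String toNtStatusHex(int exitCode) {
        // %X prints the two's-complement bit pattern of a negative int
        return String.format("0x%08X", exitCode);
    }

    public static void main(String[] args) {
        System.out.println(toNtStatusHex(-1073741515));
    }
}
```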
[jira] [Commented] (TEZ-2346) TEZ-UI: Load other info / counter data on demand
[ https://issues.apache.org/jira/browse/TEZ-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504809#comment-14504809 ] Prakash Ramachandran commented on TEZ-2346: --- [~Sreenath] bq. In the above scenario am trying to filter based on status and applicationId. As both of them are available in primaryfilters, am not sure why we are dependent on otherinfo. ATS allows only one primaryfilter. So in the Tez UI, if more than one filter is specified as primary, the rest are moved to secondaryfilter (status, dagname etc. are present in both, and timeline checks both otherinfo and primaryfilter for the same; see getFilterProperties in paginated_content.js for the setting of filters from the UI). ATS first filters by primary and then by secondary. Also, the status is updated in primaryFilters only after the dag finishes. Regarding the exception - I believe the following code causes the issue (since no otherinfo is specified in the fields, entity.getOtherInfo will be null and the get will cause an NPE). {code:title=LeveldbTimelineStore.java} if (fields.contains(Field.OTHER_INFO)) { otherInfo = true; } else { entity.setOtherInfo(null); } ... ... public void setOtherInfo(Map<String, Object> otherInfo) { if (otherInfo != null && !(otherInfo instanceof HashMap)) { this.otherInfo = new HashMap<String, Object>(otherInfo); } else { this.otherInfo = (HashMap<String, Object>) otherInfo; } } {code} {code:title=LeveldbTimelineStore.java} if (secondaryFilters != null) { for (NameValuePair filter : secondaryFilters) { Object v = entity.getOtherInfo().get(filter.getName()); {code} TEZ-UI: Load other info / counter data on demand Key: TEZ-2346 URL: https://issues.apache.org/jira/browse/TEZ-2346 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: Screen-Shot-2015-04-21-at-1.56.28-AM.jpg, TEZ-2346.wip.1.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
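The failure mode described in the comment above (calling get() on a null otherInfo map while applying secondary filters) can be avoided with a null guard. The following is only an illustrative sketch: SecondaryFilterSketch and matches are hypothetical names, and treating a missing otherInfo map as "no match" is an assumption made for the example, not the fix adopted in the timeline store.

```java
import java.util.HashMap;
import java.util.Map;

class SecondaryFilterSketch {

    /**
     * Returns true when otherInfo holds the expected value under the given name.
     * A null map (otherInfo was not requested) is treated as "no match" in this sketch.
     */
    static boolean matches(Map<String, Object> otherInfo, String name, Object expected) {
        if (otherInfo == null) {
            return false; // guard: avoids the NullPointerException on get()
        }
        Object value = otherInfo.get(name);
        return value != null && value.equals(expected);
    }

    public static void main(String[] args) {
        Map<String, Object> info = new HashMap<>();
        info.put("status", "SUCCEEDED");
        System.out.println(matches(info, "status", "SUCCEEDED"));
        System.out.println(matches(null, "status", "SUCCEEDED"));
    }
}
```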
[jira] [Commented] (TEZ-2344) TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table
[ https://issues.apache.org/jira/browse/TEZ-2344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504729#comment-14504729 ] Sreenath Somarajapuram commented on TEZ-2344: - All templates except basic-cell support bound values and are capable of displaying dynamic data. It's just that the getCellContent function must be equipped to delegate changes. TEZ-UI: Equip basic-ember-table's cell level loading for all use cases in all DAGs table Key: TEZ-2344 URL: https://issues.apache.org/jira/browse/TEZ-2344 Project: Apache Tez Issue Type: Sub-task Reporter: Sreenath Somarajapuram Assignee: Sreenath Somarajapuram Attachments: TEZ-2344.1.patch 1. Must handle promises, objects and primitive data types. 2. Must be generic 3. Display waiting animation or Not Available! messages when required. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (TEZ-2340) TestRecoveryParser fails
[ https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504739#comment-14504739 ] Jeff Zhang edited comment on TEZ-2340 at 4/21/15 10:34 AM: --- The root cause of the test failure is that all the testcases use the same directory for recovery so that the delete operation may fails because the last test case may not close the file stream. Attach the patch to use different recovery path for each test case. {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. {code} was (Author: zjffdu): The root cause of the test failure is that all the testcases use the directory for recovery so that the delete operation may fails because the last test case may not close the file stream. Attach the patch to use different recovery path for each test case. {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. 
{code} TestRecoveryParser fails Key: TEZ-2340 URL: https://issues.apache.org/jira/browse/TEZ-2340 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: TEZ-2340-1.patch Stacktrace {code} java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) {code} Standard Output {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. 
2015-04-17 07:23:55,703 INFO [Thread-5] impl.TestDAGImpl (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan 2015-04-17 07:23:55,722 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService 2015-04-17 07:23:55,723 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceStart(127)) - Starting RecoveryService 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(314)) - Error handling summary event, eventType=DAG_SUBMITTED java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at
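The approach described for this issue (a separate recovery path per test case, so that one test's unclosed stream cannot block another test's cleanup) can be sketched with the JDK alone. This is not the content of TEZ-2340-1.patch; RecoveryDirSketch and newRecoveryDir are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class RecoveryDirSketch {

    /** Creates a fresh, uniquely named directory for one test case. */
    static Path newRecoveryDir(String testName) {
        try {
            // createTempDirectory appends a random suffix, so names never collide
            return Files.createTempDirectory("recovery-" + testName + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path first = newRecoveryDir("testSkipAllOtherEvents_1");
        Path second = newRecoveryDir("testSkipAllOtherEvents_2");
        System.out.println(first.equals(second)); // distinct directories per test
    }
}
```

On Windows a file with an open handle generally cannot be deleted, which is why sharing one directory across test cases shows up as "Failed to delete file or dir" there while passing elsewhere.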
[jira] [Commented] (TEZ-2340) TestRecoveryParser fails
[ https://issues.apache.org/jira/browse/TEZ-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14504786#comment-14504786 ] TezQA commented on TEZ-2340: {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726829/TEZ-2340-1.patch against master revision decb419. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/501//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/501//console This message is automatically generated. 
TestRecoveryParser fails Key: TEZ-2340 URL: https://issues.apache.org/jira/browse/TEZ-2340 Project: Apache Tez Issue Type: Bug Reporter: Jeff Zhang Assignee: Jeff Zhang Attachments: TEZ-2340-1.patch Stacktrace {code} java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) {code} Standard Output {code} 2015-04-17 07:23:55,672 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\.summary.crc]: it still exists. 2015-04-17 07:23:55,674 WARN [main] fs.FileUtil (FileUtil.java:deleteImpl(187)) - Failed to delete file or dir [D:\w\tez\tez-dag\target\org.apache.tez.dag.app.TestRecoveryParser-tmpDir\recovery\1\summary]: it still exists. 
2015-04-17 07:23:55,703 INFO [Thread-5] impl.TestDAGImpl (TestDAGImpl.java:createTestDAGPlan(446)) - Setting up dag plan 2015-04-17 07:23:55,722 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceInit(109)) - Initializing RecoveryService 2015-04-17 07:23:55,723 INFO [Thread-5] recovery.RecoveryService (RecoveryService.java:serviceStart(127)) - Starting RecoveryService 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(314)) - Error handling summary event, eventType=DAG_SUBMITTED java.io.IOException: Not supported at org.apache.hadoop.fs.ChecksumFileSystem.append(ChecksumFileSystem.java:352) at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1174) at org.apache.tez.dag.history.recovery.RecoveryService.handleSummaryEvent(RecoveryService.java:365) at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:285) at org.apache.tez.dag.app.TestRecoveryParser.testSkipAllOtherEvents_1(TestRecoveryParser.java:138) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:601) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) 2015-04-17 07:23:55,724 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(318)) - Adding a flag to ensure next AM attempt does not start up, 
flagFile=target/org.apache.tez.dag.app.TestRecoveryParser-tmpDir/recovery/1/RecoveryFatalErrorOccurred 2015-04-17 07:23:55,725 ERROR [Thread-5] recovery.RecoveryService (RecoveryService.java:handle(323)) - Recovery failure occurred. Skipping all events
Success: TEZ-2340 PreCommit Build #501
Jira: https://issues.apache.org/jira/browse/TEZ-2340 Build: https://builds.apache.org/job/PreCommit-TEZ-Build/501/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 2765 lines...] [INFO] Final Memory: 70M/948M [INFO] {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12726829/TEZ-2340-1.patch against master revision decb419. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-TEZ-Build/501//testReport/ Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/501//console This message is automatically generated. == == Adding comment to Jira. == == Comment added. 4d7861aaf92c41cac0d6379b502fa4770e4c5275 logged out == == Finished build. == == Archiving artifacts Sending artifact delta relative to PreCommit-TEZ-Build #499 Archived 44 artifacts Archive block size is 32768 Received 26 blocks and 1907931 bytes Compression is 30.9% Took 1.3 sec Description set: TEZ-2340 Recording test results Email was triggered for: Success Sending email for trigger: Success ### ## FAILED TESTS (if any) ## All tests passed
[jira] [Updated] (TEZ-2292) Add e2e test for error reporting when vertex manager invokes plugin APIs
[ https://issues.apache.org/jira/browse/TEZ-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bikas Saha updated TEZ-2292: Attachment: TEZ-2292.2.patch Attaching patch that fixes the review comments and also removes TezException from the signature. For now that seems like the prudent thing to do given that reconfiguration failure is almost never an optional event. Later when we have/support cases where reconfiguration failure is ok then we can create a maybeReconfigureVertex() that allows for it - per offline discussion with [~hitesh]. Add e2e test for error reporting when vertex manager invokes plugin APIs Key: TEZ-2292 URL: https://issues.apache.org/jira/browse/TEZ-2292 Project: Apache Tez Issue Type: Task Reporter: Bikas Saha Assignee: Bikas Saha Priority: Blocker Fix For: 0.7.0 Attachments: TEZ-2292.1.patch, TEZ-2292.2.patch If the Vertex Manager has an error or cannot apply a required reconfiguration then it should be allowed to fail the vertex. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (TEZ-2322) Succeeded count wrong for Pig on Tez job, decreased 380 => 181
[ https://issues.apache.org/jira/browse/TEZ-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505200#comment-14505200 ] Hitesh Shah commented on TEZ-2322: -- [~harisekhon] Given that the AM did crash and recover, the succeeded task count can go down between the 2 attempts. The reason is that we do not checkpoint/sync the state on each task completion but only at certain points (for performance reasons, as AM crashes are rare). For the most part, tasks are recovered, but in certain situations some tasks end up getting re-run if the recovery/state log had a lag. I think the succeeded count going down is fine, but for the tasks that could not be recovered in the second attempt, the failed attempt count should have been increased accordingly. Succeeded count wrong for Pig on Tez job, decreased 380 => 181 -- Key: TEZ-2322 URL: https://issues.apache.org/jira/browse/TEZ-2322 Project: Apache Tez Issue Type: Bug Affects Versions: 0.5.2 Environment: HDP 2.2 Reporter: Hari Sekhon Priority: Minor Attachments: attempt1_syslog_dag_1427546104095_0146_1, attempt2_syslog, attempt2_syslog_dag_1427546104095_0146_1, attempt2_syslog_dag_1427546104095_0146_1_post During a Pig on Tez job the number of succeeded tasks dropped from 380 => 181 as shown below: {code} 2015-04-15 15:09:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 16, diagnostics= 2015-04-15 15:10:56,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 181 Running: 724 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:36,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 182 Running: 723 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:11:56,993 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 184 Running: 721 Failed: 0 Killed: 0 FailedTaskAttempts: 10 KilledTaskAttempts: 89, diagnostics= 2015-04-15 15:12:16,992 [Timer-0] INFO org.apache.pig.backend.hadoop.executionengine.tez.TezJob - DAG Status: status=RUNNING, progress=TotalTasks: 905 Succeeded: 186 Running: 719 Failed: 0 {code} Now this may be because tasks failed (some certainly did, due to space exceptions, having checked the logs), but surely once a task has finished successfully and is marked as succeeded it cannot later be removed from the succeeded count? Perhaps the succeeded counter is incremented too early, before the task results are really saved? KilledTaskAttempts jumped from 16 => 89 at the same time, but even this doesn't account for the large drop in the number of succeeded tasks. 
There was also a noticeable jump in Running tasks from 58 => 724 at the same time, which is suspicious. I'm pretty sure there was no contending job that finished and released so much more resource to this Tez job, so it's also unclear how the running count could have jumped up so significantly given that the cluster hardware resources have been the same throughout. Hari Sekhon http://www.linkedin.com/in/harisekhon -- This message was sent by Atlassian JIRA (v6.3.4#6332)
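For anyone wanting to spot this pattern in their own client logs, here is a minimal sketch (not part of Tez or Pig) that scans "DAG Status" lines like the ones quoted in this report and flags any point where the Succeeded count goes down between consecutive status lines. The function names `parse_status` and `find_succeeded_drops` are hypothetical, and the regular expression assumes the field order shown in the report. It only requires the TotalTasks, Succeeded and Running fields, so a truncated line still parses.

```python
import re

# Assumed layout, taken from the log lines quoted in this report:
#   "... progress=TotalTasks: 905 Succeeded: 380 Running: 58 Failed: 0 ..."
STATUS_RE = re.compile(
    r"TotalTasks: (?P<total>\d+) "
    r"Succeeded: (?P<succeeded>\d+) "
    r"Running: (?P<running>\d+)"
)


def parse_status(line):
    """Return the task counters from one status line, or None if absent."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def find_succeeded_drops(lines):
    """Return (previous, current) Succeeded pairs wherever the count fell."""
    drops = []
    previous = None
    for line in lines:
        current = parse_status(line)
        if current is None:
            continue  # skip lines that are not DAG status reports
        if previous is not None and current["succeeded"] < previous["succeeded"]:
            drops.append((previous["succeeded"], current["succeeded"]))
        previous = current
    return drops


# Three status lines abbreviated from the report (logger prefix shortened).
SAMPLE = [
    "2015-04-15 15:10:36,992 DAG Status: status=RUNNING, progress=TotalTasks: 905 "
    "Succeeded: 380 Running: 58 Failed: 0 Killed: 0",
    "2015-04-15 15:10:56,992 DAG Status: status=RUNNING, progress=TotalTasks: 905 "
    "Succeeded: 181 Running: 724 Failed: 0 Killed: 0",
    "2015-04-15 15:11:36,992 DAG Status: status=RUNNING, progress=TotalTasks: 905 "
    "Succeeded: 182 Running: 723 Failed: 0 Killed: 0",
]
```

Calling `find_succeeded_drops(SAMPLE)` reports the single drop between the 15:10:36 and 15:10:56 lines. In the second line, Succeeded plus Running (181 + 724) equals TotalTasks (905), which fits the explanation that every task not recovered from the state log was rescheduled after the AM restart.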