+ Dev for awareness

Thank you, Szehon, for your investigation. It also looks like the builds
where TestMiniTezCliDriver does not appear in the results are the ones
where it timed out. Logs are here:

http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-408/failed/TestMiniTezCliDriver/

The only thing of note is the exception below.


2014-06-07 21:07:30,489 DEBUG rpc.DAGClientRPCImpl
(DAGClientRPCImpl.java:resetProxy(144)) - Resetting AM proxy for app:
application_1402200067997_0012 dag:dag_1402200067997_
12_000001 due to exception :
org.apache.tez.dag.api.TezException: com.google.protobuf.ServiceException:
org.apache.hadoop.ipc.RemoteException(org.apache.tez.dag.api.TezException):
No running dag at present
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAG(DAGAppMaster.java:1035)
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAGStatus(DAGAppMaster.java:1013)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:79)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8286)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at
org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatusViaAM(DAGClientRPCImpl.java:170)
        at
org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatus(DAGClientRPCImpl.java:83)
        at
org.apache.tez.mapreduce.client.YARNRunner.getJobStatus(YARNRunner.java:673)
        at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
        at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
        at org.apache.hadoop.mapreduce.Job.getJobState(Job.java:347)
        at
org.apache.hadoop.mapred.JobClient$NetworkedJob.getJobState(JobClient.java:295)
        at
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:243)
        at
org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:546)
        at
org.apache.hadoop.hive.ql.io.rcfile.merge.BlockMergeTask.execute(BlockMergeTask.java:216)
        at
org.apache.hadoop.hive.ql.exec.DDLTask.mergeFiles(DDLTask.java:520)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:467)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:159)
        at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1507)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1273)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1091)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:914)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:904)
        at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:272)
        at
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:224)
        at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:434)
        at
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:370)
        at
org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:920)
        at
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.runTest(TestMiniTezCliDriver.java:644)
        at
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_create_merge_compressed(TestMiniTezCliDriver.java:368)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at junit.framework.TestCase.runTest(TestCase.java:168)
        at junit.framework.TestCase.runBare(TestCase.java:134)
        at junit.framework.TestResult$1.protect(TestResult.java:110)
        at junit.framework.TestResult.runProtected(TestResult.java:128)
        at junit.framework.TestResult.run(TestResult.java:113)
        at junit.framework.TestCase.run(TestCase.java:124)
        at junit.framework.TestSuite.runTest(TestSuite.java:243)
        at junit.framework.TestSuite.run(TestSuite.java:238)
        at
org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
        at
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
        at
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
        at
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
        at
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Caused by: com.google.protobuf.ServiceException:
org.apache.hadoop.ipc.RemoteException(org.apache.tez.dag.api.TezException):
No running dag at present
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAG(DAGAppMaster.java:1035)
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAGStatus(DAGAppMaster.java:1013)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:79)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8286)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:216)
        at com.sun.proxy.$Proxy86.getDAGStatus(Unknown Source)
        at
org.apache.tez.dag.api.client.rpc.DAGClientRPCImpl.getDAGStatusViaAM(DAGClientRPCImpl.java:165)
        ... 48 more
Caused by:
org.apache.hadoop.ipc.RemoteException(org.apache.tez.dag.api.TezException):
No running dag at present
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAG(DAGAppMaster.java:1035)
        at
org.apache.tez.dag.app.DAGAppMaster$DAGClientHandler.getDAGStatus(DAGAppMaster.java:1013)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolBlockingPBServerImpl.getDAGStatus(DAGClientAMProtocolBlockingPBServerImpl.java:79)
        at
org.apache.tez.dag.api.client.rpc.DAGClientAMProtocolRPC$DAGClientAMProtocol$2.callBlockingMethod(DAGClientAMProtocolRPC.java:8286)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at org.apache.hadoop.ipc.Client.call(Client.java:1410)
        at org.apache.hadoop.ipc.Client.call(Client.java:1363)
        at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
        ... 50 more
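
For reference, the `ret=124` in the quoted PTest output below is consistent with the test run hitting its time limit rather than failing on its own. A minimal sketch, assuming the `timeout` in that command is GNU coreutils `timeout` (the 1-second limit and the `sleep` stand-in are illustrative only, not the real 2h `mvn` invocation):

```shell
# Assumption: GNU coreutils timeout. It exits with status 124 when the
# command is still running at the deadline and has to be killed.
ret=0
timeout 1 sleep 5 || ret=$?   # sleep stands in for the long-running mvn test
echo "ret=$ret"
```

A command that fails by itself before the deadline would pass its own exit status through instead, so 124 specifically points at the time limit.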


On Mon, Jun 9, 2014 at 1:13 PM, Szehon Ho <sze...@cloudera.com> wrote:

> It looks like a JVM OOM crash during the MiniTezCliDriver tests, or it's
> otherwise crashing.  The 407 log has failures, but the 408 log is cut off.
>
>
> http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-407/failed/TestMiniTezCliDriver/maven-test.txt
>
> http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-408/failed/TestMiniTezCliDriver/maven-test.txt
>
> The MAVEN_OPTS is already set to "-XmX2g -XX:MaxPermSize=256M".  Do you
> guys know of any such issues?
>
> Thanks,
> Szehon
>
>
>
> On Sun, Jun 8, 2014 at 12:05 PM, Brock Noland <br...@cloudera.com> wrote:
>
>> Looks like it's failing to generate a test output:
>>
>>
>> http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-408/failed/TestMiniTezCliDriver/
>>
>>
>> http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-408/failed/TestMiniTezCliDriver/TestMiniTezCliDriver.txt
>>
>> exiting with 124 here:
>>
>> + wait 21961
>> + timeout 2h mvn -B -o test 
>> -Dmaven.repo.local=/home/hiveptest//ip-10-31-188-232-hiveptest-2/maven 
>> -Phadoop-2 -Phadoop-2 -Dtest=TestMiniTezCliDriver
>> + ret=124
>>
>>
>>
>>
>>
>> On Sun, Jun 8, 2014 at 11:25 AM, Ashutosh Chauhan <hashut...@apache.org>
>> wrote:
>>
>>> Build #407 ran MiniTezCliDriver
>>> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/407/testReport/org.apache.hadoop.hive.cli/
>>>
>>> but Build #408 didn't
>>> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/408/testReport/org.apache.hadoop.hive.cli/
>>>
>>>
>>> On Sat, Jun 7, 2014 at 12:25 PM, Szehon Ho <sze...@cloudera.com> wrote:
>>>
>>>> Sounds like there's randomness, either in the PTest test-parser or in the
>>>> maven test itself.  In the history now, it's running between 5633 and 5707
>>>> tests, which is similar to your range.
>>>>
>>>>
>>>> http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/394/testReport/history/
>>>>
>>>> I didn't see any in the history without MiniTezCliDriver; can you point me
>>>> to a build no. if you see one?  If nobody else knows immediately, I can dig
>>>> deeper into it next week to try to find out.
>>>>
>>>>
>>>> On Sat, Jun 7, 2014 at 9:00 AM, Ashutosh Chauhan <hashut...@apache.org>
>>>> wrote:
>>>>
>>>>> I noticed that the PTest2 framework runs a different number of tests on
>>>>> different runs, e.g., in yesterday's runs I saw it run 5585 and 5510 tests
>>>>> on consecutive runs. In particular, it seems it's running the
>>>>> MiniTezCliDriver tests in only half the runs. Has anyone else observed this?
>>>>>
>>>>>
>>>>>
