Re: Loading tpc-ds
Yes, yours might have been different. Looks like Tim's GVO and mine failed with very similar-looking errors, though.

On Thu, Aug 3, 2017 at 9:52 PM, Jim Apple wrote:
> When I saw this, there was a "FATAL" in hive.log, so perhaps they are
> different.
>
> https://issues.apache.org/jira/browse/IMPALA-5663
>
> https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1827/artifact/Impala/logs_static/logs/cluster/hive/hive.log/*view*/
Re: Loading tpc-ds
Just saw this error again. I filed IMPALA-5765.
Re: Loading tpc-ds
It looks like the same error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-31_23-55-05_306_8385818677737494274-760/_task_tmp.-ext-1/ss_sold_date_sk=2450988/_tmp.00_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:751)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
    ... 8 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-31_23-55-05_306_8385818677737494274-760/_task_tmp.-ext-1/ss_sold_date_sk=2450988/_tmp.00_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

    at org.apache.hadoop.ipc.Client.call(Client.java:1502)
    at org.apache.hadoop.ipc.Client.call(Client.java:1439)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
    at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1814)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1610)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:773)
2017-07-31 23:55:38,630 ERROR exec.Task (SessionState.java:printError(1103)) - Ended Job = job_local1252085428_0826
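For anyone who wants to check "same error" mechanically rather than by eye, here is a minimal sketch that pulls the numbers out of the "could only be replicated to" message so two failures can be compared without the per-run staging path getting in the way. The function names and the returned dictionary keys are hypothetical; the message format is only what appears in this thread, so other Hadoop versions may word it differently.

```python
import re

# Pattern built from the message text quoted in this thread (an assumption:
# other HDFS versions may phrase the message differently).
REPLICATION_RE = re.compile(
    r"File (?P<path>\S+) could only be replicated to (?P<replicated>\d+) nodes "
    r"instead of minReplication \(=(?P<min_replication>\d+)\)\. "
    r"There are (?P<running>\d+) datanode\(s\) running"
)


def parse_replication_error(text):
    """Return the fields of the first replication error in `text`, or None."""
    # Collapse wrapped lines and repeated spaces so mail-wrapped logs still match.
    flat = " ".join(text.split())
    match = REPLICATION_RE.search(flat)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "replicated": int(match.group("replicated")),
        "min_replication": int(match.group("min_replication")),
        "datanodes_running": int(match.group("running")),
    }


def same_failure(first, second):
    """Compare two parsed errors, ignoring the run-specific staging path."""
    keys = ("replicated", "min_replication", "datanodes_running")
    return all(first[k] == second[k] for k in keys)
```

This only compares the counts in the message. It says nothing about whether the underlying cause is the same.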
Re: Loading tpc-ds
I saw this on GVO: https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1807/

I haven't pulled out the error from hive.log yet - for some reason that log is almost 500 MB.
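Since the log is that large, one way to pull the error out without opening the whole file is to stream it line by line. This is only a sketch: the helper names are made up for illustration, and the two patterns ("FATAL" and "could only be replicated to N nodes") are simply the strings mentioned in this thread, not an exhaustive list of what Hive may log.

```python
import re
from typing import Iterable, Iterator, Tuple

# Strings taken from the errors quoted in this thread; adjust as needed.
ERROR_RE = re.compile(r"FATAL|could only be replicated to \d+ nodes")


def find_error_lines(lines: Iterable[str], limit: int = 20) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, text) for the first `limit` lines matching ERROR_RE."""
    found = 0
    for lineno, line in enumerate(lines, start=1):
        if ERROR_RE.search(line):
            yield lineno, line.rstrip("\n")
            found += 1
            if found >= limit:
                return


def scan_log(path: str, limit: int = 20):
    """Scan a log file lazily and return the first matching lines."""
    # Iterating over the file object reads one line at a time, so the
    # whole log is never held in memory at once.
    with open(path, "r", errors="replace") as log:
        return list(find_error_lines(log, limit))
```

For example, `scan_log("logs/cluster/hive/hive.log")` would return the first 20 matches with their line numbers, assuming the log lives at the path mentioned earlier in the thread.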
Re: Loading tpc-ds
I'm not sure exactly what is going on, but I can confirm that I was able to load data on Ubuntu 16.04 with OpenJDK 8 a while back.
Re: Loading tpc-ds
I also see this with the Oracle JDK. I have also now checked I am not running out of memory.

Oracle JDK7 is harder to get one's hands on, and OpenJDK7 isn't packaged by Canonical for Ubuntu 16.04.
Loading tpc-ds
I'm getting data loading errors on Ubuntu 16.04 in TPC-DS. The terminal shows:

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask

logs/cluster/hive/hive.log shows the error below, which previous bugs have called an issue with the disk being out of space, but my disk has at least 45GB left on it

IMPALA-3246, IMPALA-2856, IMPALA-2617

I see this with openJDK8. I haven't tried Oracle's JDK yet.

Has anyone else seen this and been able to diagnose it as something that doesn't mean a full disk?

FATAL ExecReducer (ExecReducer.java:reduce(264)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":48147,"_col1":17805,"_col2":27944,"_col3":606992,"_col4":3193,"_col5":16641,"_col6":10,"_col7":209,"_col8":44757,"_col9":20,"_col10":5.51,"_col11":9.36,"_col12":9.17,"_col13":0,"_col14":183.4,"_col15":110.2,"_col16":187.2,"_col17":3.66,"_col18":0,"_col19":183.4,"_col20":187.06,"_col21":73.2,"_col22":2452013}}
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:346)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-12_22-51-18_139_3687815919405186455-760/_task_tmp.-ext-1/ss_sold_date_sk=2452013/_tmp.01_0 could only be replicated to 0 nodes instead of minReplication (=1). There are 3 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:751)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
    at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
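As a quick way to rule the full-disk explanation in or out on the local machine, a small check like the following can be run while the data load is in progress. It is a sketch under assumptions: the helper names and the threshold are hypothetical, and it only reports free space as the operating system sees it for the given path. Whether the datanodes consider that space usable is a separate question that this does not answer.

```python
import shutil


def free_gib(path="/"):
    """Return free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: (total, used, free) in bytes
    return usage.free / 2**30


def has_headroom(path, min_free_gib):
    """True if the filesystem holding `path` has at least `min_free_gib` free."""
    return free_gib(path) >= min_free_gib


if __name__ == "__main__":
    # The 10 GiB threshold is an arbitrary illustration, not a known requirement.
    print(f"free: {free_gib('.'):.1f} GiB, enough: {has_headroom('.', 10)}")
```

Pointing `path` at the directory the minicluster writes its data to (wherever that is on a given checkout) gives a more relevant number than checking `/`.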