Re: Loading tpc-ds

2017-07-31 Thread Tim Armstrong
It looks like the same error:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-31_23-55-05_306_8385818677737494274-760/_task_tmp.-ext-1/ss_sold_date_sk=2450988/_tmp.00_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:751)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
... 8 more
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-31_23-55-05_306_8385818677737494274-760/_task_tmp.-ext-1/ss_sold_date_sk=2450988/_tmp.00_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)

at org.apache.hadoop.ipc.Client.call(Client.java:1502)
at org.apache.hadoop.ipc.Client.call(Client.java:1439)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:413)
at sun.reflect.GeneratedMethodAccessor68.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:260)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1814)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1610)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:773)
2017-07-31 23:55:38,630 ERROR exec.Task (SessionState.java:printError(1103)) - Ended Job = job_local1252085428_0826

Re: Loading tpc-ds

2017-07-31 Thread Tim Armstrong
I saw this on GVO:
https://jenkins.impala.io/job/ubuntu-14.04-from-scratch/1807/

I haven't pulled the error out of hive.log yet - for some reason that log
is almost 500 MB.
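
One way to pull this error out of a log that large without opening it in an editor is to stream it line by line. The following is only a sketch: the function name count_replication_errors is hypothetical, and it assumes the HDFS message appears on a single line in hive.log with the same wording as the trace quoted in this thread.

```python
import re
from collections import Counter
from pathlib import Path

# Wording copied from the HDFS error in this thread; the capture groups are
# the number of nodes written to and the configured minReplication.
PATTERN = re.compile(
    r"could only be replicated to (\d+) nodes instead of minReplication \(=(\d+)\)"
)


def count_replication_errors(log_path):
    """Count matching lines, keyed by (replicated_to, min_replication).

    The file is read one line at a time, so a log of several hundred MB
    never has to fit in memory.
    """
    counts = Counter()
    with Path(log_path).open("r", errors="replace") as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                counts[(int(match.group(1)), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    # Log location as mentioned in this thread; adjust for your checkout.
    path = Path("logs/cluster/hive/hive.log")
    if path.exists():
        for (got, wanted), n in count_replication_errors(path).items():
            print(f"{n} line(s): replicated to {got}, minReplication={wanted}")
```

The counts only say how often the message occurs; they do not by themselves explain why the datanodes rejected the block.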

On Thu, Jul 13, 2017 at 3:52 PM, Tim Armstrong wrote:

> I'm not sure exactly what is going on, but I can confirm that I was able
> to load data on Ubuntu 16.04 with OpenJDK 8 a while back.
>
> On Thu, Jul 13, 2017 at 2:58 PM, Jim Apple  wrote:
>
>> I also see this with the Oracle JDK. I have also now checked I am not
>> running out of memory.
>>
>> Oracle JDK7 is harder to get one's hands on, and OpenJDK7 isn't packaged
>> by Canonical for Ubuntu 16.04.
>>
>> On Wed, Jul 12, 2017 at 11:20 PM, Jim Apple  wrote:
>>
>> > I'm getting data loading errors on Ubuntu 16.04 in TPC-DS. The terminal
>> > shows:
>> >
>> > ERROR : FAILED: Execution Error, return code 2 from
>> > org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>> >
>> > logs/cluster/hive/hive.log shows the error below, which previous bugs
>> > have called an issue with the disk being out of space, but my disk has
>> > at least 45GB left on it
>> >
>> > IMPALA-3246, IMPALA-2856, IMPALA-2617
>> >
>> > I see this with openJDK8. I haven't tried Oracle's JDK yet.
>> >
>> > Has anyone else seen this and been able to diagnose it as something that
>> > doesn't mean a full disk?
>> >
>> >
>> > FATAL ExecReducer (ExecReducer.java:reduce(264)) - org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{},"value":{"_col0":48147,"_col1":17805,"_col2":27944,"_col3":606992,"_col4":3193,"_col5":16641,"_col6":10,"_col7":209,"_col8":44757,"_col9":20,"_col10":5.51,"_col11":9.36,"_col12":9.17,"_col13":0,"_col14":183.4,"_col15":110.2,"_col16":187.2,"_col17":3.66,"_col18":0,"_col19":183.4,"_col20":187.06,"_col21":73.2,"_col22":2452013}}
>> > at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:253)
>> > at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>> > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>> > at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:346)
>> > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>> > at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>> > at java.lang.Thread.run(Thread.java:748)
>> > Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /test-warehouse/tpcds.store_sales/.hive-staging_hive_2017-07-12_22-51-18_139_3687815919405186455-760/_task_tmp.-ext-1/ss_sold_date_sk=2452013/_tmp.01_0 could only be replicated to 0 nodes instead of minReplication (=1).  There are 3 datanode(s) running and no node(s) are excluded in this operation.
>> > at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1724)
>> > at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3385)
>> > at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:683)
>> > at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:214)
>> > at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:495)
>> > at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>> > at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
>> > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
>> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217)
>> > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213)
>> > at java.security.AccessController.doPrivileged(Native Method)
>> > at javax.security.auth.Subject.doAs(Subject.java:422)
>> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
>> > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2211)
>> >
>> > at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:751)
>> > at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
>> > at 

Re: Reminder: "newbie" label on tickets

2017-07-31 Thread yu feng
Great! I will pick some up and contribute.

2017-08-01 10:04 GMT+08:00 Jim Apple :

> https://issues.apache.org/jira/browse/IMPALA-5742?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20newbie%20AND%20assignee%20in%20(EMPTY)
>
> On Mon, Jul 31, 2017 at 7:01 PM, yu feng  wrote:
>
> > As a newbie to the Impala community, I have completed one JIRA. Where
> > are the newbie tickets that I can try to solve? Thanks a lot.
> >
> > 2017-07-31 23:57 GMT+08:00 Tim Armstrong :
> >
> > > Let's also make sure that everything with the "newbie" label is
> > > actually straightforward and has a clear end-goal. Oh, and is
> > > reasonably easy to test.
> > >
> > > E.g. adding a built-in function is a good one if the semantics of the
> > > function are clearly documented in the JIRA and there aren't any
> > > potential compatibility issues.
> > >
> > > We've seen a few new contributors pick up JIRAs with the newbie label
> > > that sounded easy but were actually tricky to get right - that's not a
> > > great experience.
> > >
> > >
> > >
> > > On Sun, Jul 30, 2017 at 1:30 PM, Jim Apple wrote:
> > >
> > > > As a reminder, when you file a ticket, you can label tickets that
> > > > could be completed by a first-time Impala contributor "newbie". This
> > > > can be a tool to help grow the community.
> > > >
> > >
> >
>


Re: Reminder: "newbie" label on tickets

2017-07-31 Thread Jim Apple
https://issues.apache.org/jira/browse/IMPALA-5742?jql=project%20%3D%20IMPALA%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20labels%20%3D%20newbie%20AND%20assignee%20in%20(EMPTY)

On Mon, Jul 31, 2017 at 7:01 PM, yu feng  wrote:

> As a newbie to the Impala community, I have completed one JIRA. Where are
> the newbie tickets that I can try to solve? Thanks a lot.
>
> 2017-07-31 23:57 GMT+08:00 Tim Armstrong :
>
> > Let's also make sure that everything with the "newbie" label is actually
> > straightforward and has a clear end-goal. Oh, and is reasonably easy to
> > test.
> >
> > E.g. adding a built-in function is a good one if the semantics of the
> > function are clearly documented in the JIRA and there aren't any
> > potential compatibility issues.
> >
> > We've seen a few new contributors pick up JIRAs with the newbie label
> > that sounded easy but were actually tricky to get right - that's not a
> > great experience.
> >
> >
> >
> > On Sun, Jul 30, 2017 at 1:30 PM, Jim Apple  wrote:
> >
> > > As a reminder, when you file a ticket, you can label tickets that could
> > > be completed by a first-time Impala contributor "newbie". This can be a
> > > tool to help grow the community.
> > >
> >
>


Re: Reminder: "newbie" label on tickets

2017-07-31 Thread yu feng
As a newbie to the Impala community, I have completed one JIRA. Where are the
newbie tickets that I can try to solve? Thanks a lot.

2017-07-31 23:57 GMT+08:00 Tim Armstrong :

> Let's also make sure that everything with the "newbie" label is actually
> straightforward and has a clear end-goal. Oh, and is reasonably easy to
> test.
>
> E.g. adding a built-in function is a good one if the semantics of the
> function are clearly documented in the JIRA and there aren't any potential
> compatibility issues.
>
> We've seen a few new contributors pick up JIRAs with the newbie label that
> sounded easy but were actually tricky to get right - that's not a great
> experience.
>
>
>
> On Sun, Jul 30, 2017 at 1:30 PM, Jim Apple  wrote:
>
> > As a reminder, when you file a ticket, you can label tickets that could
> > be completed by a first-time Impala contributor "newbie". This can be a
> > tool to help grow the community.
> >
>


[REMINDER] Policies around Publicity & Press

2017-07-31 Thread John D. Ament
All Podlings,

In recent days I've been contacted about several publicity issues that have
gone a bit off kilter.  I wanted to remind podlings of two very important
policies.

1. Podlings MUST coordinate with the Public Relations Committee on all
publicity activities.
2. The Press Team MUST review any press releases or similar communication
referencing a podling before it is published.

Relevant Links:

https://incubator.apache.org/guides/branding.html#publicity_activities
https://incubator.apache.org/guides/branding.html#publicity_throughout_the_incubation_process

Please reach out with any concerns.

Regards,

John D. Ament
VP, Apache Incubator


Re: Reminder: "newbie" label on tickets

2017-07-31 Thread Tim Armstrong
Let's also make sure that everything with the "newbie" label is actually
straightforward and has a clear end-goal. Oh, and is reasonably easy to
test.

E.g. adding a built-in function is a good one if the semantics of the
function are clearly documented in the JIRA and there aren't any potential
compatibility issues.

We've seen a few new contributors pick up JIRAs with the newbie label that
sounded easy but were actually tricky to get right - that's not a great
experience.



On Sun, Jul 30, 2017 at 1:30 PM, Jim Apple  wrote:

> As a reminder, when you file a ticket, you can label tickets that could be
> completed by a first-time Impala contributor "newbie". This can be a tool
> to help grow the community.
>