Hi Ning,
I'll see if I can cause it with a smaller dataset.
I also noticed this on a simple "select count(1) from .." query, which is perhaps
related:
java.lang.RuntimeException: Hive Runtime Error while closing operators
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:358)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:897)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:539)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:335)
    ... 4 more
Caused by: java.lang.NullPointerException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1108)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$Buffer.write(MapTask.java:1025)
    at java.io.DataOutputStream.writeInt(DataOutputStream.java:180)
    at org.apache.hadoop.io.BytesWritable.write(BytesWritable.java:159)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:90)
    at org.apache.hadoop.io.serializer.WritableSerialization$WritableSerializer.serialize(WritableSerialization.java:77)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:892)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
    at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:217)
    at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:456)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:696)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:841)
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:869)
    ... 9 more
Only some tasks fail while the job as a whole completes correctly, so it feels
more like a Hadoop issue to me.
The table is partitioned but never contains more than 20 partitions, with
roughly 14M records in total.
Thanks,
Bennie.
________________________________
From: Ning Zhang [mailto:[email protected]]
Sent: Thursday, June 03, 2010 5:46 PM
To: '[email protected]'
Subject: Re: java.lang.OutOfMemoryError: PermGen space when running as a service.
Thanks for the detailed report, Bennie. There might be a memory leak in JDBC or
hiveserver itself. Are your queries roughly the same (in terms of query size
and number of partitions involved)? Better still, can you come up with a simple
test case (a series of queries) that exposes the memory leak problem?
Thanks,
Ning
------
Sent from my blackberry
________________________________
From: Bennie Schut <[email protected]>
To: '[email protected]' <[email protected]>
Sent: Thu Jun 03 02:20:38 2010
Subject: java.lang.OutOfMemoryError: PermGen space when running as a service.
Hi guys,
When I run Hive as a service like this: "hive --service hiveserver",
I get these errors after about a day of running a lot of queries:
java.lang.OutOfMemoryError: PermGen space
I use:
trunk Hive from about a week ago, with "-XX:MaxPermSize=128m"
Hadoop 0.20.2
MySQL 5.1.45 metastore
some UDFs on each query
I've now increased MaxPermSize to 512m to see if it helps.
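In case it helps to reproduce my setup, this is roughly how I pass the flag to
the service JVM. It's a sketch: the use of HADOOP_OPTS as the carrier for the
JVM option is an assumption about how the hive launcher script picks up options
and may differ per distribution.

```shell
# Sketch: raise the PermGen cap for the hiveserver JVM before starting it.
# Assumption: the hive launcher script forwards JVM options from HADOOP_OPTS;
# adjust the variable name if your distribution wires this differently.
export HADOOP_OPTS="$HADOOP_OPTS -XX:MaxPermSize=512m"
hive --service hiveserver
```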
I've also made a jmap dump (85M) while it was broken:
9931 instances of class org.apache.hadoop.hive.ql.exec.ColumnInfo
4466 instances of class org.apache.hadoop.hive.ql.plan.ExprNodeColumnDesc
3127 instances of class org.apache.hadoop.hive.metastore.api.FieldSchema
3127 instances of class org.apache.hadoop.hive.metastore.api.FieldSchema$Isset
2755 instances of class org.apache.commons.logging.impl.Log4JLogger
2497 instances of class org.datanucleus.util.WeakValueMap$WeakValueReference
1404 instances of class org.apache.hadoop.ipc.Client$Call
1404 instances of class org.apache.hadoop.ipc.RPC$Invocation
1394 instances of class org.apache.hadoop.ipc.RemoteException
985 instances of class [Lorg.datanucleus.plugin.ConfigurationElement;
945 instances of class org.datanucleus.plugin.ConfigurationElement
907 instances of class org.apache.hadoop.hive.ql.hooks.LineageInfo$BaseColumnInfo
792 instances of class [Lorg.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
767 instances of class org.apache.hadoop.hive.ql.hooks.LineageInfo$Dependency
730 instances of class org.antlr.runtime.BitSet
618 instances of class com.mysql.jdbc.ConnectionPropertiesImpl$BooleanConnectionProperty
523 instances of class org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$IdentityConverter
500 instances of class org.datanucleus.sco.backed.Map
464 instances of class org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge
I'm not sure whether this is what you'd expect to see, or whether something
stands out here?
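For reference, a histogram like the one above can be taken with the stock JDK
tools; a sketch (<pid> is a placeholder for the hiveserver process id, which I
look up with jps):

```shell
# Sketch: inspect a live JVM suspected of a PermGen leak.
jps                   # list running JVMs to find the hiveserver pid
jmap -histo <pid>     # per-class instance counts, like the list above
jmap -permstat <pid>  # per-classloader PermGen statistics (Sun JDK 6 jmap)
```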
Thanks in advance for any ideas on this.
Bennie.