In Hive 1.2.1, automatic estimation of the buffer size happens only if the 
column count is >1000.
You need HIVE-11807 (https://issues.apache.org/jira/browse/HIVE-11807) for 
automatic estimation by default, or a release >= Hive 2.0.
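If you are stuck on 1.2.1, the buffer size can also be set explicitly in the 
job configuration before the writer is created. A minimal sketch of a new-API 
MapReduce driver (the job name is a placeholder; hive.exec.orc.default.buffer.size 
is the actual Hive 1.2.1 key):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    Configuration conf = new Configuration();
    // Override the 256 KB per-stream default; 32 KB mirrors the value that
    // resolved the dev-environment OOM further down this thread.
    conf.set("hive.exec.orc.default.buffer.size", "32768");
    Job job = Job.getInstance(conf, "csv-to-orc");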

Are you using dynamic partitioning in Hive?

Thanks
Prasanth

On Aug 30, 2016, at 3:22 PM, Hank baker <hankbake...@gmail.com> wrote:

Hi Prasanth,

Thanks for your quick reply. As I mentioned in the previous mail, it was the 
same stack trace in about 60 failed reducers. I am using Hive 1.2.1; I'm not 
sure which newer version you are referring to.

But exactly as you pointed out, when I tried to reproduce this issue on my 
local setup by simply writing a large number of columns, the stack trace did vary.

Also, from the WriterImpl code, it appears that the stripes have already been 
flushed before the metadata is written. I may be mistaken, so please correct me 
if I'm wrong. This is one of the reasons I believe this is more than just a 
simple memory issue related to column count.


On Wed, Aug 31, 2016 at 3:42 AM, Prasanth Jayachandran 
<pjayachand...@hortonworks.com> wrote:
Under memory pressure, the OOM stack trace can differ depending on which 
component is requesting more memory when the heap is already full. That is why 
you are seeing the OOM in writeMetadata (it can happen in other places as well). 
When dealing with thousands of columns, it's better to set 
hive.exec.orc.default.buffer.size to a lower value until you no longer hit OOM. 
Depending on the version of Hive you are using, this may be set automatically 
for you. In older Hive versions, if the number of columns is >1000, the buffer 
size will be chosen automatically. In newer versions, this limit is removed and 
the ORC writer will figure out the optimal buffer size based on stripe size, 
available memory, and number of columns.
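As a rough back-of-envelope (the streams-per-column and buffer counts below are 
ballpark assumptions for illustration, not exact writer internals), the buffer 
memory scales with column count:

    // Illustrative estimate of ORC writer stream-buffer memory; the stream
    // count per column is an assumption (it varies by data type and encoding).
    int columns = 800;             // width of the failing table
    int streamsPerColumn = 3;      // e.g. PRESENT, DATA, LENGTH
    long bufferBytes = 262144;     // hive.exec.orc.default.buffer.size default (256 KB)
    long total = columns * streamsPerColumn * bufferBytes;
    System.out.printf("~%d MB in stream buffers%n", total / (1024 * 1024));
    // ~600 MB at the 256 KB default; the same math at 32 KB gives ~75 MB.

That is close to a typical reducer heap, which is why lowering the buffer size 
helps even when reducer input row counts are small.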

Thanks
Prasanth


On Aug 30, 2016, at 3:04 PM, Hank baker <hankbake...@gmail.com> wrote:

Hi all,

I'm trying to run a MapReduce job to convert CSV data into ORC using 
OrcNewOutputFormat (the reduce phase is required to satisfy some partitioning 
logic), but I am getting an OOM error in the reduce phase (during merge, to be 
exact) with the stack trace attached below. This happens for one particular 
table, which has about 800 columns, and the error is common across all reducers 
(minimum reducer input is about 20 records, maximum is about 100 million). I am 
trying to figure out the exact cause of the error, since I have used the same 
job to convert tables with 100-10000 columns without any memory or config changes.
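For reference, a minimal sketch of how such a driver might be wired up (the 
reducer class and output path are hypothetical placeholders; OrcNewOutputFormat 
expects NullWritable keys and rows serialized as Writable values, typically via 
OrcSerde):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "csv-to-orc");
    job.setOutputFormatClass(OrcNewOutputFormat.class); // ORC files written via WriterImpl
    job.setOutputKeyClass(NullWritable.class);          // the key is ignored by the writer
    job.setOutputValueClass(Writable.class);            // OrcSerde-serialized rows
    job.setReducerClass(PartitioningReducer.class);     // hypothetical: applies the partitioning logic
    FileOutputFormat.setOutputPath(job, new Path("/tmp/orc-out")); // placeholder path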

What concerns me in the stack trace is this line:

        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.writeMetadata(WriterImpl.java:2327)

Why is it going OOM while trying to write metadata?

I originally believed this was simply due to the number of open buffers (as 
mentioned in 
http://mail-archives.apache.org/mod_mbox/hive-dev/201410.mbox/%3c543d5eb6.2000...@apache.org%3E). 
So I wrote a bit of code to reproduce the error on my local setup by creating 
an instance of OrcRecordWriter and writing a large number of columns. I did get 
a similar heap space error; however, it was going OOM while trying to flush the 
stripes, with this in the stack trace:

at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:2133)

This issue on the dev environment got resolved by setting

hive.exec.orc.default.buffer.size=32k

Will the same setting work for the original error?

For different reasons I cannot change the reducer memory or lower the buffer 
size even at a job level. For now, I am just trying to understand the source of 
this error. Can anyone please help?

Original OOM stack trace:


FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: Java heap space
        at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
        at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewInputBuffer(OutStream.java:107)
        at org.apache.hadoop.hive.ql.io.orc.OutStream.write(OutStream.java:140)
        at com.google.protobuf.CodedOutputStream.refreshBuffer(CodedOutputStream.java:833)
        at com.google.protobuf.CodedOutputStream.flush(CodedOutputStream.java:843)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.writeMetadata(WriterImpl.java:2327)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2426)
        at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.close(ReduceTask.java:550)
        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:629)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


