cool thanks, will try

2015-09-24 9:32 GMT+01:00 Prasanth Jayachandran <pjayachand...@hortonworks.com>:
> With 650 columns you might need to reduce the compression buffer size to
> 8KB, down from the default of 256KB (maybe try decreasing it further if that
> fails, or increasing it if it succeeds, to find the right size). You can do
> that by setting orc.compress.size in tblproperties.
>
> On Sep 24, 2015, at 3:27 AM, Patrick Duin <patd...@gmail.com> wrote:
>
> Thanks for the reply,
> My first thought was out of memory as well, but the IllegalArgumentException
> happens before; it is a separate entry in the log, and the OOM exception is
> not its cause. So I am not sure where that OOM exception fits in. I've tried
> running it with more memory and got the same problem; it was also
> consistently failing on the same split.
> We have about 650 columns. I don't know how many record writers are open
> (how can I see that?).
> I'll try running it with a reduced stripe size and see if that helps.
> The weird thing is we have a production cluster running the same hadoop/hive
> versions, the same code and the same data, and it processes just fine. I get
> this error only in our QA cluster.
> It's hard to locate the difference :).
> Anyway, thanks for the pointers, I'll do some more digging.
>
> Cheers,
> Patrick
>
> 2015-09-24 0:51 GMT+01:00 Prasanth Jayachandran <pjayachand...@hortonworks.com>:
>
>> Looks like you are running out of memory. Try increasing the heap memory
>> or reducing the stripe size. How many columns are you writing? Any idea
>> how many record writers are open per map task?
>>
>> - Prasanth
>>
>> On Sep 22, 2015, at 4:32 AM, Patrick Duin <patd...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I am struggling to understand a stack trace I am getting while trying to
>> write an ORC file. I am using hive-0.13.0/hadoop-2.4.0.
>>
>> 2015-09-21 09:15:44,603 INFO [main] org.apache.hadoop.mapred.MapTask:
>> Ignoring exception during close for
>> org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector@2ce49e21
>> java.lang.IllegalArgumentException: Column has wrong number of index entries
>> found: org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry$Builder@6eeb967b expected: 1
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:578)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1398)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
>>   at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
>>   at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
>>   at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1990)
>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:774)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> 2015-09-21 09:15:45,988 FATAL [main] org.apache.hadoop.mapred.YarnChild:
>> Error running child : java.lang.OutOfMemoryError: Java heap space
>>   at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
>>   at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
>>   at org.apache.hadoop.hive.ql.io.orc.OutStream.getNewOutputBuffer(OutStream.java:117)
>>   at org.apache.hadoop.hive.ql.io.orc.OutStream.spill(OutStream.java:168)
>>   at org.apache.hadoop.hive.ql.io.orc.OutStream.flush(OutStream.java:239)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:583)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StringTreeWriter.writeStripe(WriterImpl.java:1012)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1400)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1780)
>>   at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:2040)
>>   at org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat$OrcRecordWriter.close(OrcNewOutputFormat.java:67)
>>   at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:647)
>>   at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>>   at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>>   at java.security.AccessController.doPrivileged(Native Method)
>>   at javax.security.auth.Subject.doAs(Subject.java:415)
>>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
>>   at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
>>
>> I've seen https://issues.apache.org/jira/browse/HIVE-9080 and I think that
>> might be related.
>>
>> I am not using Hive directly, though; I am using a map-only job that writes
>> via OrcNewOutputFormat.
>>
>> Any pointers would be appreciated. Has anyone seen this before?
>>
>> Thanks,
>>
>> Patrick
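For a map-only job that writes through OrcNewOutputFormat (rather than through a
Hive table, where Prasanth's orc.compress.size tblproperty would apply), the
compression buffer and stripe sizes would come from the job Configuration. Below
is a minimal sketch of how that could look, assuming the
hive.exec.orc.default.buffer.size and hive.exec.orc.default.stripe.size keys are
what Hive 0.13's OrcFile.writerOptions(conf) reads (worth verifying against that
exact build); the 8KB and 64MB values and the class/job names are illustrative
only, not taken from the thread.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hive.ql.io.orc.OrcNewOutputFormat;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Job;

public class OrcWriteJob {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Assumed default keys picked up when the ORC writer is created from the
    // job Configuration. The writer keeps compression buffers per column
    // stream, so with ~650 columns a smaller buffer cuts heap use sharply.
    conf.setInt("hive.exec.orc.default.buffer.size", 8 * 1024);           // down from 256KB
    conf.setLong("hive.exec.orc.default.stripe.size", 64L * 1024 * 1024); // smaller stripes flush sooner

    Job job = Job.getInstance(conf, "orc-write");
    job.setJarByClass(OrcWriteJob.class);
    job.setOutputFormatClass(OrcNewOutputFormat.class);
    job.setOutputKeyClass(NullWritable.class);
    job.setOutputValueClass(Writable.class);
    job.setNumReduceTasks(0); // map-only, as in the original job

    // ... mapper class, input format and input/output paths as already configured ...

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

If the ORC writer is instead created directly, the same knobs should be settable
explicitly on the writer options, e.g. OrcFile.writerOptions(conf).bufferSize(8 * 1024).stripeSize(64L * 1024 * 1024),
though that path bypasses OrcNewOutputFormat.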