[
https://issues.apache.org/jira/browse/HIVE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108883#comment-15108883
]
Demeter Sztanko commented on HIVE-6347:
---------------------------------------
Hello,
I am running my cluster on Hadoop 2.7.1 (checksum
fc0a1a23fc1868e4d5ee7fa2b28a58a) with Hive 1.2.1 (checksum
ab480aca41b24a9c3751b8c023338231), and enabling hive.exec.orc.zerocopy causes
jobs to fail.
All my Hive queries run fine, but once I enable zerocopy, the map tasks fail
with an UnsatisfiedLinkError in the native Snappy decompressor:
{code}
set hive.exec.orc.zerocopy = true;
<execute my query>
Hadoop job information for Stage-1: number of mappers: 316; number of reducers: 90
2016-01-20 16:37:54,479 Stage-1 map = 0%, reduce = 0%
2016-01-20 16:38:21,061 Stage-1 map = 100%, reduce = 100%
2016-01-20 16:39:21,246 Stage-1 map = 100%, reduce = 100%
Ended Job = job_1452780282075_23380 with errors
Error during job, obtaining debugging information...
Diagnostic Messages for this Task:
Error: java.io.IOException: java.lang.reflect.InvocationTargetException
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.<init>(HadoopShimsSecure.java:213)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:169)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:432)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
... 11 more
Caused by: java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect()I
at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect(Native Method)
at org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressDirect(SnappyDecompressor.java:305)
at org.apache.hadoop.io.compress.snappy.SnappyDecompressor$SnappyDirectDecompressor.decompress(SnappyDecompressor.java:341)
at org.apache.hadoop.hive.shims.ZeroCopyShims$DirectDecompressorAdapter.decompress(ZeroCopyShims.java:101)
at org.apache.hadoop.hive.ql.io.orc.SnappyCodec.directDecompress(SnappyCodec.java:100)
at org.apache.hadoop.hive.ql.io.orc.SnappyCodec.decompress(SnappyCodec.java:67)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:214)
at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:227)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54)
at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:302)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.readDictionaryLengthStream(TreeReaderFactory.java:1674)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringDictionaryTreeReader.startStripe(TreeReaderFactory.java:1654)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StringTreeReader.startStripe(TreeReaderFactory.java:1382)
at org.apache.hadoop.hive.ql.io.orc.TreeReaderFactory$StructTreeReader.startStripe(TreeReaderFactory.java:2040)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:795)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:986)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:1019)
at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:205)
at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat$VectorizedOrcRecordReader.<init>(VectorizedOrcInputFormat.java:71)
at org.apache.hadoop.hive.ql.io.orc.VectorizedOrcInputFormat.getRecordReader(VectorizedOrcInputFormat.java:156)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createVectorizedReader(OrcInputFormat.java:1088)
at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1102)
at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:67)
... 16 more
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
{code}
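The outer IOException and InvocationTargetException are only wrappers; the innermost "Caused by:" entry is the one that matters here. For anyone triaging similar task logs, a minimal sketch that pulls out that innermost exception class might look like the following. The function name root_cause is hypothetical and not part of Hive or Hadoop, and the sketch assumes the log keeps each "Caused by:" marker at the start of a line.

```python
def root_cause(trace):
    """Return the exception class named by the last 'Caused by:' line, or None.

    Hypothetical helper for reading task diagnostics; it only inspects text
    and makes no calls into Hive or Hadoop.
    """
    causes = [line.strip() for line in trace.splitlines()
              if line.strip().startswith("Caused by:")]
    if not causes:
        return None
    # Drop the marker, then keep the class name that precedes any message.
    rest = causes[-1][len("Caused by:"):].strip()
    parts = rest.split(":", 1)[0].split()
    return parts[0] if parts else None


if __name__ == "__main__":
    sample = (
        "Error: java.io.IOException: java.lang.reflect.InvocationTargetException\n"
        "Caused by: java.lang.reflect.InvocationTargetException\n"
        "Caused by: java.lang.UnsatisfiedLinkError:\n"
    )
    print(root_cause(sample))
```

Applied to the diagnostics above, this points at the UnsatisfiedLinkError raised from SnappyDecompressor.decompressBytesDirect rather than at the reflection wrappers.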
> ZeroCopy read path for ORC RecordReader
> ---------------------------------------
>
> Key: HIVE-6347
> URL: https://issues.apache.org/jira/browse/HIVE-6347
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: tez-branch
> Reporter: Gopal V
> Assignee: Gopal V
> Fix For: tez-branch
>
> Attachments: HIVE-6347.1.patch, HIVE-6347.2-tez.patch,
> HIVE-6347.3-tez.patch, HIVE-6347.4-tez.patch, HIVE-6347.5-tez.patch
>
>
> ORC can use the new HDFS Caching APIs and the ZeroCopy readers to avoid extra
> data copies into memory while scanning files.
> Implement ORC zcr codepath and a hive.orc.zerocopy flag.