[
https://issues.apache.org/jira/browse/HIVE-6347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109350#comment-15109350
]
Gopal V commented on HIVE-6347:
-------------------------------
{code}
Caused by: java.lang.UnsatisfiedLinkError:
org.apache.hadoop.io.compress.snappy.SnappyDecompressor.decompressBytesDirect()I
{code}
It does look like the tasks are not able to locate the libhadoop.so binary (*or*
the hadoop build was done without snappy-dev available).
Zero-copy readers don't work if libhadoop.so is missing anyway.
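A quick way to tell which of the two cases you've hit is {{hadoop checknative -a}}, or the equivalent probe in code. A minimal sketch against the Hadoop 2.x NativeCodeLoader API (class name here is just for illustration):
{code}
import org.apache.hadoop.util.NativeCodeLoader;

public class NativeSnappyCheck {
  public static void main(String[] args) {
    // false => libhadoop.so was not found on java.library.path
    // (the first failure mode above)
    boolean haveNative = NativeCodeLoader.isNativeCodeLoaded();
    System.out.println("libhadoop.so loaded: " + haveNative);

    if (haveNative) {
      // buildSupportsSnappy() is itself a native call, so only probe it
      // after libhadoop is confirmed loaded; false => libhadoop was
      // compiled without snappy-dev (the second failure mode)
      System.out.println("built with snappy: "
          + NativeCodeLoader.buildSupportsSnappy());
    }
  }
}
{code}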
But to fill in some later developments: turning on zerocopy=true needs a
cluster-wide config change to enable YARN-1775.
Without that change, YARN's memory accounting counts memory-mapped files as
container memory, so you might see containers being killed for using too much
memory as you scale past the terabyte level.
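What YARN-1775 adds is smaps-based (PSS) memory accounting in the NodeManager's container monitor, so pages memory-mapped from the HDFS cache stop being charged in full to the container. A minimal yarn-site.xml sketch, using the property name YARN-1775 introduced:
{code}
<!-- yarn-site.xml, on every NodeManager -->
<property>
  <!-- Read /proc/<pid>/smaps and use PSS instead of plain RSS, so
       shared memory-mapped pages are not fully counted against the
       container's memory limit (YARN-1775). Off by default. -->
  <name>yarn.nodemanager.container-monitor.procfs-tree.smaps-based-rss.enabled</name>
  <value>true</value>
</property>
{code}
For completeness: the Hive-side switch eventually shipped as hive.exec.orc.zerocopy ({{set hive.exec.orc.zerocopy=true;}}), rather than the hive.orc.zerocopy working name from the issue description.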
> ZeroCopy read path for ORC RecordReader
> ---------------------------------------
>
> Key: HIVE-6347
> URL: https://issues.apache.org/jira/browse/HIVE-6347
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: tez-branch
> Reporter: Gopal V
> Assignee: Gopal V
> Fix For: tez-branch
>
> Attachments: HIVE-6347.1.patch, HIVE-6347.2-tez.patch,
> HIVE-6347.3-tez.patch, HIVE-6347.4-tez.patch, HIVE-6347.5-tez.patch
>
>
> ORC can use the new HDFS Caching APIs and the ZeroCopy readers to avoid extra
> data copies into memory while scanning files.
> Implement ORC zcr codepath and a hive.orc.zerocopy flag.