Hi Owen,

Thanks for the response. I saw that DirectDecompressor will be used if
available, so the difference is only in compression.
Keeping in mind what you said, I looked at the code again. As far as I can
see, the only ORC-specific setting is "nowrap" = true in Deflater. As far as
I understand from the description, that should correspond directly
to CompressionHeader.NO_HEADER in ZlibCompressor. If so,
ZlibCompressor with the right setup could be a replacement for Deflater. What
do you think?
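
For reference, here is a minimal sketch of what I mean by "nowrap", using
only java.util.zip. The class and method names are my own for illustration;
this is not the ORC or Hadoop code, and I have not run the equivalent test
against ZlibCompressor, so the NO_HEADER equivalence is still an assumption
on my side.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, written only to illustrate the nowrap flag.
final class RawDeflateSketch {

    // Compresses one independent block. With nowrap = true the output is a
    // raw deflate stream, without the 2-byte zlib header and the 4-byte
    // Adler-32 trailer.
    static byte[] compressBlock(byte[] input, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[input.length + 64];
            int len = 0;
            while (!deflater.finished()) {
                if (len == buf.length) {
                    // Grow the buffer if the data did not compress well.
                    buf = Arrays.copyOf(buf, buf.length * 2);
                }
                len += deflater.deflate(buf, len, buf.length - len);
            }
            return Arrays.copyOf(buf, len);
        } finally {
            deflater.end();
        }
    }

    // Decompresses one block whose uncompressed length is already known.
    // The nowrap flag must match the one used for compression.
    static byte[] decompressBlock(byte[] compressed, int originalLength,
                                  boolean nowrap) {
        Inflater inflater = new Inflater(nowrap);
        try {
            inflater.setInput(compressed);
            byte[] out = new byte[originalLength];
            int len = 0;
            while (len < originalLength && !inflater.finished()) {
                int n = inflater.inflate(out, len, originalLength - len);
                if (n == 0 && (inflater.needsInput()
                        || inflater.needsDictionary())) {
                    break; // truncated or unexpected input
                }
                len += n;
            }
            return Arrays.copyOf(out, len);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
    }
}
```

Calling compressBlock(data, true) and compressBlock(data, false) on the same
input should give outputs that differ only by the zlib wrapper, so if
NO_HEADER in ZlibCompressor produces the same raw format, the two should be
interchangeable at the byte level.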

Aleksei

Aleksei Statkevich | Engineering Manager


On Thu, Jun 23, 2016 at 2:35 PM, Owen O'Malley <[email protected]> wrote:

>
>
> On Fri, Jun 17, 2016 at 11:31 PM, Aleksei Statkevich <
> [email protected]> wrote:
>
>> Hello,
>>
>> I recently looked at ORC encoding and noticed
>> that hive.ql.io.orc.ZlibCodec uses java's java.util.zip.Deflater and not
>> Hadoop's native ZlibCompressor.
>>
>> Can someone please tell me what is the reason for it?
>>
>
> It is more subtle than that. The first piece to notice is that if your
> Hadoop has the direct decompression
> (org.apache.hadoop.io.compress.zlib.ZlibDirectDecompressor), it will be
> used. The reason that the ZlibCompressor isn't used is because ORC needs a
> different API. In particular, ORC doesn't use stream compression, but
> rather block compression. That is done so that it can jump over compression
> blocks for predicate push down. (If you are skipping over a lot of values,
> ORC doesn't need to decompress the bytes.)
>
> .. Owen
>
>
>
>>
>> Also, how does performance of Deflater (which also uses native
>> implementation) compare to Hadoop's native zlib implementation?
>>
>> Thanks,
>> Aleksei
>>
>>
>
