A bloom filter index was recently added to ORC; it is much more accurate at
row group elimination than the min/max based index.
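
To illustrate why a bloom filter can skip row groups that a min/max index cannot, here is a minimal sketch in Python. It is not ORC's implementation: the `BloomFilter` class, its bit and hash counts, and the `minmax_might_contain` helper are all hypothetical names chosen for this example. The row group holds only the values 1 and 1000, and the query looks for 500.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; the sizes are illustrative, not ORC's defaults."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits |= 1 << pos

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(value))


def minmax_might_contain(values, target):
    # A min/max index can only say whether the target lies inside the range.
    return min(values) <= target <= max(values)


row_group = [1, 1000]
bloom = BloomFilter()
for v in row_group:
    bloom.add(v)

# 500 lies inside [1, 1000], so min/max cannot eliminate the row group.
minmax_keeps = minmax_might_contain(row_group, 500)
# The bloom filter has at most 6 of 1024 bits set, so a false positive
# for 500 is extremely unlikely and the row group can be skipped.
bloom_keeps = bloom.might_contain(500)
```

The min/max check keeps any row group whose range happens to cover the value, while the bloom filter answers per value, at the cost of occasional false positives (never false negatives).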
Thanks
Prasanth
> On Jul 16, 2015, at 9:07 AM, Thomas Abeler wrote:
>
> Hey,
>
>
>
> I have a question about how indexing in ORC works
>
>
>
> The way I understo
Hi Sean
We are still in the process of moving the ORC Java code out of the Apache Hive
project. The Java code will eventually land in the Apache ORC project. At this
point, only the C++ reader for ORC is in Apache ORC master.
Thanks
Prasanth
> On Aug 13, 2015, at 12:27 PM, Sean Luo wrote:
>
> Hi comm
On Aug 13, 2015, at 12:37 PM, Sean Luo wrote:
>
> But won't the C++ ORC reader need the underlying Java? Or does the current
> Apache ORC C++ code not work?
>
>
>
> On Thursday, August 13, 2015 12:34 PM, Prasanth J
> wrote:
>
>
> Hi Sean
>
> We are stil
Oops.. missed the link
http://mail-archives.apache.org/mod_mbox/orc-user/201509.mbox/%3c560ab8d2.7070...@darose.net%3e
Thanks
Prasanth
> On Nov 10, 2015, at 1:10 PM, Prasanth J wrote:
>
> Please read this similar thread for more context on why S3 is slow. You
> should be usi
Please read this similar thread for more context on why S3 is slow. You should
be using the newer s3a implementation, which has huge performance gains over
the old s3n.
Thanks
Prasanth
> On Nov 10, 2015, at 1:05 PM, Alan Gates wrote:
>
> ORC does a lot of seeks inside its files in order to only load t
All complex types are flattened out and written as primitive column streams
(strings, longs, doubles, floats, etc.). String columns are dictionary encoded.
If there are too many distinct keys, then dictionary encoding will
automatically be turned off.
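
As a rough sketch of the dictionary-encoding fallback described above, the Python below encodes a string column as a sorted dictionary plus integer ids, and falls back to writing the values directly when the share of distinct keys is too high. The function name `encode_string_column` and the 0.8 ratio are assumptions made for this illustration; the actual ORC writer's heuristic and on-disk layout may differ.

```python
def encode_string_column(values, max_distinct_ratio=0.8):
    """Return ("dictionary", dictionary, ids) or ("direct", values).

    max_distinct_ratio is a hypothetical threshold for this sketch.
    """
    if not values:
        return ("direct", [])
    dictionary = sorted(set(values))
    # Too many distinct keys: the dictionary would not save space.
    if len(dictionary) / len(values) > max_distinct_ratio:
        return ("direct", list(values))
    index = {key: i for i, key in enumerate(dictionary)}
    return ("dictionary", dictionary, [index[v] for v in values])


repetitive = ["us", "de", "us", "us", "de", "fr"]
unique = ["a1", "b2", "c3", "d4", "e5"]

encoded_repetitive = encode_string_column(repetitive)
encoded_unique = encode_string_column(unique)
```

Here `repetitive` has 3 distinct keys in 6 values, so it stays dictionary encoded, while every value in `unique` is distinct, so the sketch turns dictionary encoding off for it.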
Thanks
Prasanth
> On Dec 1, 2016, at 11:35 AM,