Thanks guys! Appreciate the pointers.

On Thu, Apr 13, 2017 at 3:15 PM, rahul challapalli <[email protected]> wrote:

> What you need is a format plugin. You can take a look at the Text Format
> plugin while reading Paul's documentation, which Abhishek already shared.
> Don't look at Parquet, as it is more complicated. A short summary of what
> you need (maybe too short to be of any use :) ):
>
> 1. A group of classes which make Drill recognize your format plugin.
> 2. An ORC reader. This will be the heart of this project. Essentially you
> provide a way to read data (columns) from ORC files and then populate
> Drill's value vectors. You can later enhance this by parallelizing the
> reads of individual columns.
> 3. Once you have the format plugin working, you might want to start playing
> with planner rules if you want features like "filter pushdown into the
> scan" etc.
>
> - Rahul
>
> On Apr 13, 2017 2:57 PM, "Manoj Murumkar" <[email protected]>
> wrote:
>
> Thanks. I knew about the Hive table format support. I'll look into reading
> directly from ORC files on HDFS (a la Parquet). Is there some documentation
> around how to develop a new storage plugin?
>
> > On Apr 13, 2017, at 2:51 PM, Abhishek Girish <[email protected]> wrote:
> >
> > Drill does not support ORC as a DFS file format. You are welcome to
> > contribute. As a workaround, Drill supports reading ORC files via the Hive
> > plugin, so you should be able to use that.
> >
> > On Thu, Apr 13, 2017 at 2:19 PM, Manoj Murumkar <[email protected]> wrote:
> >
> >> Hi!
> >>
> >> I am wondering if someone is actively working on ORC support already.
> >> Appreciate any pointers.
> >>
> >> Thanks,
> >>
> >> Manoj
> >>
>
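
For anyone reading this thread later: item 2 in Rahul's summary (read columns from the file, then populate value vectors batch by batch) might look roughly like the sketch below. This is only an illustration of the control flow. Every type in it (LongColumnVector, ColumnSource, BatchReader) is a hypothetical stand-in invented for this example, not a Drill or ORC class, and it assumes all columns hold the same number of rows.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins only: none of these types are Drill or ORC classes.
public class OrcReaderSketch {

    /** Stand-in for a value vector: a fixed-capacity buffer for one column. */
    static final class LongColumnVector {
        final long[] values;
        int valueCount;

        LongColumnVector(int capacity) { values = new long[capacity]; }

        void reset() { valueCount = 0; }

        void add(long v) { values[valueCount++] = v; }
    }

    /** Stand-in for one column stored in a columnar file. */
    static final class ColumnSource {
        private final long[] data;
        private int position;

        ColumnSource(long[] data) { this.data = data; }

        /** Copies values into the vector up to its capacity; returns the count copied. */
        int readInto(LongColumnVector vector) {
            vector.reset();
            while (position < data.length && vector.valueCount < vector.values.length) {
                vector.add(data[position++]);
            }
            return vector.valueCount;
        }
    }

    /** Fills one vector per column on each call to next(). */
    static final class BatchReader {
        private final Map<String, ColumnSource> columns;
        final Map<String, LongColumnVector> vectors = new LinkedHashMap<>();

        BatchReader(Map<String, ColumnSource> columns, int batchSize) {
            this.columns = columns;
            for (String name : columns.keySet()) {
                vectors.put(name, new LongColumnVector(batchSize));
            }
        }

        /** Loads the next batch; returns its row count, or 0 when the data is exhausted. */
        int next() {
            int rows = 0;
            // Columns are read independently of each other, so this loop is
            // where per-column reads could later be parallelized.
            for (Map.Entry<String, ColumnSource> e : columns.entrySet()) {
                rows = e.getValue().readInto(vectors.get(e.getKey()));
            }
            return rows;
        }
    }

    /** Drains a reader and returns the row count of each batch. */
    static List<Integer> batchSizes(BatchReader reader) {
        List<Integer> sizes = new ArrayList<>();
        int rows;
        while ((rows = reader.next()) > 0) {
            sizes.add(rows);
        }
        return sizes;
    }

    public static void main(String[] args) {
        Map<String, ColumnSource> columns = new LinkedHashMap<>();
        columns.put("id", new ColumnSource(new long[] {1, 2, 3, 4, 5}));
        columns.put("qty", new ColumnSource(new long[] {10, 20, 30, 40, 50}));
        BatchReader reader = new BatchReader(columns, 2);
        System.out.println(batchSizes(reader)); // prints [2, 2, 1]
    }
}
```

In a real format plugin the column source would be backed by an actual ORC reader and the vectors would be Drill's own; the Text Format plugin mentioned above is the place to see how that wiring is done.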
