Thanks guys! Appreciate the pointers.

On Thu, Apr 13, 2017 at 3:15 PM, rahul challapalli <[email protected]> wrote:
> What you need is a format plugin. You can take a look at the Text Format
> plugin while reading Paul's documentation, which Abhishek already shared.
> Don't look at Parquet, as it is more complicated. A short summary of what
> you need (maybe too short to be of any use :) ):
>
> 1. A group of classes which make Drill recognize your format plugin.
> 2. An ORC reader. This will be the heart of this project. Essentially, you
>    provide a way to read data (columns) from ORC files and then populate
>    Drill's value vectors. You can later enhance this by parallelizing the
>    reads of individual columns.
> 3. Once you have the format plugin working, you might want to start playing
>    with planner rules if you want features like "filter pushdown into the
>    scan", etc.
>
> - Rahul
>
> On Apr 13, 2017 2:57 PM, "Manoj Murumkar" <[email protected]> wrote:
>
> Thanks. I knew about the Hive table format support. I'll look into reading
> directly from ORC files on HDFS (a la Parquet). Is there some documentation
> around how to develop a new storage plugin?
>
> > On Apr 13, 2017, at 2:51 PM, Abhishek Girish <[email protected]> wrote:
> >
> > Drill does not support ORC as a DFS file format. You are welcome to
> > contribute. As a workaround, Drill supports reading ORC files via the
> > Hive plugin, so you should be able to use that.
> >
> > On Thu, Apr 13, 2017 at 2:19 PM, Manoj Murumkar <[email protected]>
> > wrote:
> >
> >> Hi!
> >>
> >> I am wondering if someone is actively working on ORC support already.
> >> Appreciate any pointers.
> >>
> >> Thanks,
> >>
> >> Manoj
> >>
>
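
For what it's worth, point 2 in Rahul's summary (read columns from the file, then populate value vectors batch by batch) can be sketched roughly as below. This is only a toy illustration in plain Java, under loud assumptions: LongVector, ColumnarFile and BatchReader are made-up stand-ins, not Drill's value vectors, not Drill's reader interface, and not the ORC library's API. The real classes are best looked up in the Text Format plugin source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: none of these types are real Drill or ORC classes. */
public class ColumnarReaderSketch {

    /** Stand-in for a value vector: one column's values for the current batch. */
    static final class LongVector {
        final String name;
        final long[] values;
        int count;

        LongVector(String name, int capacity) {
            this.name = name;
            this.values = new long[capacity];
        }

        void reset() { count = 0; }
        void add(long v) { values[count++] = v; }
    }

    /** Stand-in for a columnar file: every column is stored as its own array. */
    static final class ColumnarFile {
        final String[] columnNames;
        final long[][] columns;

        ColumnarFile(String[] columnNames, long[][] columns) {
            this.columnNames = columnNames;
            this.columns = columns;
        }

        int rowCount() { return columns.length == 0 ? 0 : columns[0].length; }
    }

    /** Reader: allocates one vector per column, then next() fills them per batch. */
    static final class BatchReader {
        private final ColumnarFile file;
        private final int batchSize;
        private final List<LongVector> vectors = new ArrayList<>();
        private int position;

        BatchReader(ColumnarFile file, int batchSize) {
            this.file = file;
            this.batchSize = batchSize;
            for (String name : file.columnNames) {
                vectors.add(new LongVector(name, batchSize));
            }
        }

        /** Fills the vectors with the next batch; returns rows read, 0 when done. */
        int next() {
            int rows = Math.min(batchSize, file.rowCount() - position);
            // Each column is copied independently of the others, which is why
            // per-column reads are a natural candidate for parallelizing later.
            for (int c = 0; c < vectors.size(); c++) {
                LongVector vector = vectors.get(c);
                vector.reset();
                for (int r = 0; r < rows; r++) {
                    vector.add(file.columns[c][position + r]);
                }
            }
            position += rows;
            return rows;
        }

        List<LongVector> vectors() { return vectors; }
    }

    /** Drains the reader and returns the size of each batch it produced. */
    static List<Integer> batchSizes(BatchReader reader) {
        List<Integer> sizes = new ArrayList<>();
        int n;
        while ((n = reader.next()) > 0) {
            sizes.add(n);
        }
        return sizes;
    }

    public static void main(String[] args) {
        ColumnarFile file = new ColumnarFile(
            new String[] {"id", "qty"},
            new long[][] {{1, 2, 3, 4, 5}, {10, 20, 30, 40, 50}});
        BatchReader reader = new BatchReader(file, 2);
        System.out.println(batchSizes(reader)); // prints [2, 2, 1]
    }
}
```

The shape to take away is "set up once, then fill the vectors on every call to next()". Points 1 and 3 (plugin registration and planner rules) are not covered by this sketch.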
