Re: Querying protocol buffers files with Drill

Cristian Espinoza Tue, 09 Sep 2014 09:34:23 -0700

Many thanks to Yash and Jason for your answers about this. I will explore
two alternatives for now:


   - Using Hive, as proposed by Jason. I'll read more on elephant-bird to
   see how to import protobuf data into Hive.
   - Converting the protobuf files to Parquet, a format Drill can already
   use as a data source. I believe I can do this using
   parquet-mr (https://github.com/Parquet/parquet-mr).
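
For the second option, once the data is in Parquet the remaining step is to
point a Drill file-system storage plugin at the directory and query the files.
Below is a minimal sketch in Python that only builds the plugin definition and
a query string; it does not talk to Drill. The directory /data/parquet, the
workspace name pb, and the file name events.parquet are hypothetical, and the
field names follow the layout of Drill's dfs plugin as I understand it, so
they may differ between versions.

```python
import json


def drill_parquet_plugin(root_dir, workspace):
    """Build a Drill file-system storage plugin definition that reads Parquet.

    Assumption: field names mirror Drill's dfs plugin layout; check them
    against the Drill version in use.
    """
    return {
        "type": "file",
        "enabled": True,
        "connection": "file:///",
        "workspaces": {
            # Hypothetical workspace pointing at the converted files.
            workspace: {"location": root_dir, "writable": False},
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


def drill_query(plugin, workspace, filename, limit=10):
    """Build a SELECT over one Parquet file; backticks quote the file name."""
    return "SELECT * FROM {}.{}.`{}` LIMIT {}".format(
        plugin, workspace, filename, limit
    )


config = drill_parquet_plugin("/data/parquet", "pb")
print(json.dumps(config, indent=2))
print(drill_query("dfs", "pb", "events.parquet"))
```

The printed JSON is what would be pasted into the storage plugin definition,
and the query assumes the plugin is registered under the name dfs.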

Cristian


> Hi Cristian,
>
> While we do not have a native protobuf reader for Drill, we do support Hive
> SerDes as an input format. This will not be the fastest way to get your
> data into the Drill engine, but it should take less coding than writing a
> record reader for Drill.
>
> If you need performance and are up for learning a bit more about Drill, we
> would certainly welcome a contribution of a protobuf reader and would be
> happy to help you get started.
>
> -Jason Altekruse


On Wed, Sep 3, 2014 at 10:58 AM, Yash Sharma <[email protected]> wrote:

> Hey Cristian, currently we do not have protobuf readers in Drill. It would,
> however, be possible to add new readers to Drill by creating new
> RecordReaders.
>
> Yash.




On Wed, Sep 3, 2014 at 1:09 PM, Cristian Espinoza <
[email protected]> wrote:

> Hi,
>
> I'm evaluating Drill and so far it looks great. My idea is to use it to
> query some protocol buffers files directly, so they appear to the rest of
> my JEE app as a data source. But I've been unable to find any information in
> the documentation about the proper way to register the file system,
> specifically the format I have to use. The docs present examples for CSV,
> JSON and Parquet formats, but none for protobuf.
>
> Is this possible? According to Drill's description it may be.
>
> Many thanks in advance,
>
> Cristián Espinoza
