Hi Amandeep,

I've put up a (hacked) avro reader udf at https://github.com/dremio/udfs.

You can edit AvroDirectRecordReader.java and change DEFAULT_SCHEMA to your
avro schema.
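
For context, here is a minimal sketch of why the reader needs a schema at all:
an Avro binary blob carries no field names or types, only the encoded values
in schema order. The sketch hand-decodes one cell with the JDK alone. It is
not how AvroDirectRecordReader.java works (that code is not shown here); the
class name AvroBlobSketch, the method names, and the two-field record
(name: string, age: int) are all hypothetical and for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: decodes an Avro binary record by hand,
// assuming the writer schema was a record with fields name:string, age:int.
class AvroBlobSketch {

    // Avro writes int and long values as zigzag-encoded variable-length integers.
    static long readLong(ByteBuffer buf) {
        long raw = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            raw |= (long) (b & 0x7F) << shift; // low 7 bits carry data
            shift += 7;
        } while ((b & 0x80) != 0);             // high bit set means "more bytes follow"
        return (raw >>> 1) ^ -(raw & 1);       // undo the zigzag mapping
    }

    // An Avro string is a long byte count followed by that many UTF-8 bytes.
    static String readString(ByteBuffer buf) {
        int length = (int) readLong(buf);
        byte[] bytes = new byte[length];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Field names and order come from the (assumed) schema, not from the bytes.
    static Map<String, Object> decodeUser(byte[] cell) {
        ByteBuffer buf = ByteBuffer.wrap(cell);
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", readString(buf));
        row.put("age", (int) readLong(buf));
        return row;
    }

    public static void main(String[] args) {
        // 0x04 = zigzag(2), then "Al", then 0x3C = zigzag(30)
        byte[] cell = {0x04, 0x41, 0x6C, 0x3C};
        System.out.println(decodeUser(cell)); // prints {name=Al, age=30}
    }
}
```

Decoding the same four bytes against a different schema would give different
(or meaningless) values, which is why DEFAULT_SCHEMA has to match the schema
the data was written with.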

Once you have added your Avro schema, follow Drill's steps
<https://drill.apache.org/docs/develop-custom-functions-introduction/> to
install the UDF.

1) Copy the jars (drill-avro-udfs-1.0-SNAPSHOT.jar and
drill-avro-udfs-1.0-SNAPSHOT-sources.jar) to jars/3rdparty/
2) Restart the drillbit(s).
3) Use the avro_parse function to parse a column from HBase.
    e.g.:
        SELECT avro_parse(T.CF1.Q1) from hbase.`mytable` T limit 1;

~ Amit.





On Tue, Sep 29, 2015 at 10:54 AM, Jason Altekruse <[email protected]>
wrote:

> If you are serializing data defined with Avro schemas into HBase then what
> Rahul said is correct, we should be able to read the columns as they were
> mapped into HBase with no special setup.
>
> If you are instead saying that you have Avro binary blobs stored in one or
> more of your HBase columns, then what you are going to need is a UDF to
> take the binary data and process it into the Drill record structure. This is
> possible because Drill can not only read complex (nested or repeated) data,
> but it also has full support throughout the engine to manipulate it, so any
> function can take in or produce a complex data value. The current best
> example of this is our convert_from function for JSON. This function can
> take JSON data stored in a varchar value and produce a complex value
> containing maps and lists.
>
> This code [1] is actually really short because it calls into our JSON
> parser which is shared with the code used to actually read JSON out of
> files. It does show you how to set up such a function to produce a complex
> output. Looking at the function that is called here
> (jsonReader.write(writer)) will show you how the JSON reader can map from
> the data produced by the
> Jackson JSON parser into Drill records. You could do something similar with
> the Avro reader to get access to data stored in a binary column.
>
> [1]
>
> https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/conv/JsonConvertFrom.java
>
> On Tue, Sep 29, 2015 at 9:33 AM, rahul challapalli <
> [email protected]> wrote:
>
> > Once you have serialized your Avro data into HBase, Avro should no longer
> > come into the picture. Now your table is just a normal HBase table. You can
> > refer to the below documentation on querying hbase tables
> >
> > https://drill.apache.org/docs/querying-hbase/
> >
> > - Rahul
> >
> > On Tue, Sep 29, 2015 at 12:14 AM, Amandeep Singh <[email protected]>
> > wrote:
> >
> > > Hi,
> > >
> > > I need to use SQL queries, as supported by Drill, to fetch data from
> > > HBase that is stored in Avro-serialized format with a predefined schema
> > > definition.
> > > Please suggest a way to do this.
> > >
> > > Regards,
> > > Amandeep Singh
> > >
> >
>
