I've seen examples from drill queries on HBase (all columns are strings)
access each column as a json object as in:

select t.column_a.xField, t.column_b.yField from some_hbase_table t

I'd like to do the same using my hive tables where I have production logging

My production log tables have the following schema

CREATE EXTERNAL TABLE logging_master_schema(
 line string
) partitioned by (month string, week string, day string);

each "line" is a log line in json, we use the partitions
to narrow down lines we crunch in hive queries where we
use fucntion calls like this get_json_object(line, '$.x.y')
to crack our logs.

A trivial example of a hive query would be:
SELECT get_json_object(line, '$.app.hcc.event_name'),
       get_json_object(line, '$.unixtime'),
       line FROM production_logs_hcc WHERE year='2014' and month='08' and
day='21'
       and get_json_object(line, '$.level') = 'EVENT'
  ORDER BY get_json_object(line, '$.unixtime')
       DESC;

I am trying to figure out now if there's a way we can use
drill to access hive and do the same kind of queries where
drill will treat the 'line' column as json as in this example:

select line.unixtime as ts,
       line.app.hcc.event_name as evt
from hive.production_logs_hcc
where year='2014' and month='08' and day='21'
      and line.level = 'EVENT'
      order by ts desc

so I started with the simplest test:

> select t.line from from hive.production_logs_hcc t --> works fine

> select t.line.level from from hive.production_logs_hcc t --> VARCHAR
assertion error

bummer, so it seams where a column in hbase is implicitly a json, a string
column
in hive is not AND no ammount of attempts to cast the log line to a varchar
and then
access as json makes any difference... (I tried this when I realized that
hive string
had no drill sql analog

Am I just out of luck here, hbase columns are json but hive text columns
aren't

Thanks for any help / explanation
_____________

john o schneider
[email protected]
408-203-7891

Reply via email to