Digging a little more, I think there must be an error in the convert_from()
function, or in whatever is passing my line to it. The error:

> Error: SYSTEM ERROR: IllegalArgumentException: You tried to write a BigInt
> type when you are using a ValueWriter of type NullableVarCharWriterImpl.
> Fragment 0:0


seems to point to the conversion of the JSON, which looks like this:

> {"level":"EVENT","app":{"app_user_info":{"session":null,"user":null,
> "user_agent":"Pingdom.com_bot_version_1.4_(http://www.pingdom.com/)"},
> "app_name":"hcc","hc":{"event_name":"logout","frontend":{"logout":
> {"username":null,"message":"User is logging out"}},"component_name":
> "frontend"}},"isotime":"2015-07-29T00:00:48.424267+00:00",
> "source":"account.views","host":"hcc-demo-prd1.apixio.com",
> "client":"184.75.210.186","time":"1438128048424"}


If I read these source log files in directly using

0: jdbc:drill:drillbit=drill.apixio.com> SELECT
distinct(log.app.app_user_info.user_agent) FROM
dfs.`user`.`/logmaster/production/hcc/2015-07-28/*` log ;

I get the expected results, with no issue parsing the JSON.

Some more verbose tracing would help me debug this issue; does any exist?
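(For reference, one way to get more detail out of Drill, assuming a 1.x session, is the `exec.errors.verbose` option:)

```sql
-- Ask Drill to include the full stack trace in error messages for this
-- session; useful for finding where the IllegalArgumentException is thrown.
ALTER SESSION SET `exec.errors.verbose` = true;
```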

/jos



_____________

john o schneider
[email protected]
408-203-7891


On Thu, Jul 30, 2015 at 12:06 AM, John Schneider <[email protected]>
wrote:

>
> Jacques - thanks for your reply. It is very much appreciated
>
> I was able to create a view, but when accessing the one column in the view
> I got an IllegalArgumentException. I tried casting values but keep coming
> back to this error.
>
> One thing I noticed, though it's perhaps not the issue, is that when I
> describe the view, the column I converted to JSON has type ANY.
> My transcript below:
>
>
> CREATE VIEW logView AS SELECT CONVERT_FROM(line, 'JSON') AS line FROM
> hive.production_logs_hcc_24
>
> 0: jdbc:drill:drillbit=drill.apixio.com> use dfs.`tmp`;
> +-------+--------------------------------------+
> |  ok   |               summary                |
> +-------+--------------------------------------+
> | true  | Default schema changed to [dfs.tmp]  |
> +-------+--------------------------------------+
> 1 row selected (0.353 seconds)
> 0: jdbc:drill:drillbit=drill.apixio.com> CREATE VIEW logView AS SELECT
> CONVERT_FROM(line, 'JSON') AS line FROM hive.production_logs_hcc_24;
> +-------+----------------------------------------------------------+
> |  ok   |                         summary                          |
> +-------+----------------------------------------------------------+
> | true  | View 'logView' created successfully in 'dfs.tmp' schema  |
> +-------+----------------------------------------------------------+
> 1 row selected (0.272 seconds)
>
> 0: jdbc:drill:drillbit=drill.apixio.com> describe logView;
> +--------------+------------+--------------+
> | COLUMN_NAME  | DATA_TYPE  | IS_NULLABLE  |
> +--------------+------------+--------------+
> | line         | ANY        | YES          |
> +--------------+------------+--------------+
> 1 row selected (0.125 seconds)
>
> 0: jdbc:drill:drillbit=drill.apixio.com> select t.line.level from logView
> t limit 10;
> Error: SYSTEM ERROR: IllegalArgumentException: You tried to write a BigInt
> type when you are using a ValueWriter of type NullableVarCharWriterImpl.
>
> Fragment 0:0
>
> [Error Id: 655d0dbd-4d8f-433f-96a6-7ab85efb0b8b on
> hadoop-data-drill:31010] (state=,code=0)
> 0: jdbc:drill:drillbit=drill.apixio.com> select t.line from logView t
> limit 10;
> Error: SYSTEM ERROR: IllegalArgumentException: You tried to write a BigInt
> type when you are using a ValueWriter of type NullableVarCharWriterImpl.
>
> Fragment 0:0
>
> [Error Id: fee663a6-b75c-414c-9d47-39b1e2437558 on
> hadoop-data-drill:31010] (state=,code=0)
>
>
>
>
> On Wed, Jul 29, 2015 at 12:20 PM, Jacques Nadeau <[email protected]>
> wrote:
>
>> Drill doesn't implicitly cast from VarChar -> Map.  However, just create a
>> Drill view:
>>
>> CREATE VIEW newView AS SELECT CONVERT_FROM(line, 'JSON') AS line FROM
>> hive.production_logs_hcc
>>
>> Then just query newView.
>>
>> Then 'select t.line.level from newView' should work just fine.
>>
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>> On Wed, Jul 29, 2015 at 12:02 PM, John Schneider <[email protected]>
>> wrote:
>>
>> > I've seen examples of drill queries on HBase (where all columns are
>> > strings) that access each column as a json object, as in:
>> >
>> > select t.column_a.xField, t.column_b.yField from some_hbase_table t
>> >
>> > I'd like to do the same using my hive tables where I have production
>> > logging
>> >
>> > My production log tables have the following schema
>> >
>> > CREATE EXTERNAL TABLE logging_master_schema(
>> >  line string
>> > ) partitioned by (month string, week string, day string);
>> >
>> > each "line" is a log line in json; we use the partitions
>> > to narrow down the lines we crunch in hive queries, where we
>> > use function calls like get_json_object(line, '$.x.y')
>> > to crack our logs.
>> >
>> > A trivial example of a hive query would be:
>> > SELECT get_json_object(line, '$.app.hcc.event_name'),
>> >        get_json_object(line, '$.unixtime'),
>> >        line
>> > FROM production_logs_hcc
>> > WHERE year='2014' and month='08' and day='21'
>> >       and get_json_object(line, '$.level') = 'EVENT'
>> > ORDER BY get_json_object(line, '$.unixtime') DESC;
>> >
>> > I am trying to figure out now if there's a way we can use
>> > drill to access hive and do the same kind of queries where
>> > drill will treat the 'line' column as json as in this example:
>> >
>> > select line.unixtime as ts,
>> >        line.app.hcc.event_name as evt
>> > from hive.production_logs_hcc
>> > where year='2014' and month='08' and day='21'
>> >       and line.level = 'EVENT'
>> >       order by ts desc
>> >
>> > so I started with the simplest test:
>> >
>> > > select t.line from hive.production_logs_hcc t --> works fine
>> >
>> > > select t.line.level from hive.production_logs_hcc t --> VARCHAR
>> > assertion error
>> >
>> > bummer, so it seems that where a column in hbase is implicitly json, a
>> > string column in hive is not, AND no amount of attempts to cast the log
>> > line to a varchar and then access it as json makes any difference... (I
>> > tried this when I realized that hive string had no drill sql analog)
>> >
>> > Am I just out of luck here: hbase columns are json but hive text columns
>> > aren't?
>> >
>> > Thanks for any help / explanation
>> >
>>
>
>
