select t.app.hcc.event_name as en
from dfs.`user`.`/logmaster/production/hcc/2015-07-30/*.json` t
where en in ('logout');
this yields the error:
Error: SYSTEM ERROR: NumberFormatException: logout
ok, so let's explicitly cast
select cast(convert_from(t.app.hcc.event_name, 'UTF8') as varchar(30)) as
en
from dfs.`user`.`/logmaster/production/hcc/2015-07-30/*.json` t
where en in ('logout');
now, just to humor drill
select cast(convert_from(t.app.hcc.event_name, 'UTF8') as varchar(30)) as
en
from dfs.`user`.`/logmaster/production/hcc/2015-07-30/*.json` t
where en in ('123');
runs, but returns no results - as would be expected because we don't use #s
as event names
Am I misunderstanding how drill types data in a schema less record?
I would have thought the explicit cast would have been enough
P.S. I ran another query like this one on a months worth of logs (a lot of
json in HDFS) and it chewed through it in less time than it takes my
current Hive query to actually start, and all of this on a single aws
m3.xlarge - this drill sucker is fast, we really want to use it.
john o schneider
[email protected]
408-203-7891