OK, so I'm trying to create an external table, load a delimited file
into it, and then do a basic SELECT out of it. Here is a description
of my scenario along with the steps I took and their results.
Hopefully someone can help me figure out what I'm doing wrong.
# Sample.tab
1227422134|2|1|paid:44519,tax:2120,value:42399
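Just to spell out the intent: with the delimiters in the CREATE TABLE
below, that row should parse as
occurred_at = 1227422134
actor_id    = 2
actee_id    = 1
properties  = {paid -> 44519, tax -> 2120, value -> 42399}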
# CREATE TABLE
hive> CREATE EXTERNAL TABLE activity_test
> (occurred_at INT, actor_id INT, actee_id INT, properties MAP<STRING, STRING>)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY "|"
> COLLECTION ITEMS TERMINATED BY ","
> MAP KEYS TERMINATED BY ":"
> LOCATION '/data/sample';
OK
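For reference, a quick sanity check at this point (output omitted) is
to describe the table to confirm the columns registered, and, if this
build supports DESCRIBE EXTENDED, to look at the serde/delimiter
properties as well:
hive> DESCRIBE activity_test;
hive> DESCRIBE EXTENDED activity_test;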
# LOAD DATA
hive> LOAD DATA LOCAL INPATH '/Users/josh/Hive/sample.tab' INTO TABLE activity_test;
Copying data from file:/Users/josh/Hive/sample.tab
Loading data to table activity_test
OK
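As far as I understand, LOAD DATA LOCAL INPATH copies the file into
the table's LOCATION, so the checks I'd use here (results omitted) are
to confirm the file actually landed under /data/sample and that a
plain select parses it:
$ hadoop fs -ls /data/sample
hive> SELECT * FROM activity_test;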
# INSERT OVERWRITE DIRECTORY
hive> FROM activity_test INSERT OVERWRITE DIRECTORY '/data/output' SELECT activity_test.occurred_at, activity_test.actor_id, activity_test.actee_id, activity_test.properties;
Total MapReduce jobs = 1
Starting Job = job_200811250653_0018, Tracking URL = http://{clipped}:50030/jobdetails.jsp?jobid=job_200811250653_0018
Kill Command = /Users/josh/Hadoop/bin/hadoop job -Dmapred.job.tracker={clipped}:54311 -kill job_200811250653_0018
map = 0%, reduce =0%
map = 50%, reduce =0%
map = 100%, reduce =0%
Ended Job = job_200811250653_0018
Moving data to: /data/output
OK
Time taken: 72.329 seconds
$ hadoop fs -cat /data/output/*
012{}
This obviously isn't the correct output; it looks like just default
values for those columns. What am I doing wrong?
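One more thought: my understanding is that INSERT OVERWRITE DIRECTORY
writes plain text with Ctrl-A (\001) as the field separator by
default, so the separators wouldn't show up in a plain cat. A raw byte
dump should at least show whether they're there at all:
$ hadoop fs -cat /data/output/* | od -c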
Thanks
Josh Ferguson