Can you send the new output of describe extended?

Thanks,
pete
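For context, the command being requested would be run from the Hive CLI against the table in Josh's message below. The table name 'basic' is taken from that message; the comment sketches the general shape of what DESCRIBE EXTENDED prints rather than a captured transcript:

hive> DESCRIBE EXTENDED basic;
-- Prints each column with its type, followed by a "Detailed Table
-- Information" section naming the SerDe and its serialization
-- parameters; that is what would confirm how the collection and
-- map-key delimiters were actually recorded in the metastore.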
On 11/26/08 6:46 AM, "Ashish Thusoo" <[EMAIL PROTECTED]> wrote:

congrats!! :) Please do file a JIRA for this. We will fix this as soon as possible.

Ashish
________________________________________
From: Josh Ferguson [EMAIL PROTECTED]
Sent: Wednesday, November 26, 2008 12:44 AM
To: [email protected]
Subject: Re: Trouble Loading Into External Table

I got it! The combination I got to work was the following:

CREATE EXTERNAL TABLE basic (actor_id INT, actee_id INT, properties MAP<STRING, STRING>)
ROW FORMAT DELIMITED
COLLECTION ITEMS TERMINATED BY '44'
MAP KEYS TERMINATED BY '58'
STORED AS TEXTFILE
LOCATION '/data/sample2';

This forced a DynamicSerde/TCTLSeparatedProtocol combination, which didn't work properly when I specified the field delimiter but, for some reason, works fine when you only specify the other two and leave the field delimiter as the default. I should probably file it in JIRA.

Josh Ferguson

On Nov 25, 2008, at 7:09 PM, Joydeep Sen Sarma wrote:

> Can you please send the output of 'describe extended activity_test'.
> This will help us understand what's happening with all the create
> table parameters.
>
> Also - as a sanity check - can you please check hadoop dfs -cat
> /data/sample/* (to make sure data got loaded/moved into that dir)
>
> -----Original Message-----
> From: Josh Ferguson [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, November 25, 2008 7:03 PM
> To: [email protected]
> Subject: Re: Trouble Loading Into External Table
>
> hive> CREATE EXTERNAL TABLE activity_test
>> (occurred_at INT, actor_id INT, actee_id INT, properties MAP<STRING, STRING>)
>> ROW FORMAT DELIMITED
>> FIELDS TERMINATED BY '124'
>> COLLECTION ITEMS TERMINATED BY '44'
>> MAP KEYS TERMINATED BY '58'
>> LINES TERMINATED BY '10'
>> STORED AS TEXTFILE
>> LOCATION '/data/sample';
> OK
>
> hive> LOAD DATA LOCAL INPATH '/Users/josh/Hive/sample.tab' INTO TABLE activity_test;
> Copying data from file:/Users/josh/Hive/sample.tab
> Loading data to table activity_test
> OK
>
> $ hadoop fs -cat /data/sample/sample.tab
> 1227422134|2|1|paid:44519,tax:2120,value:42399
>
> hive> FROM activity_test INSERT OVERWRITE DIRECTORY '/data/output2'
> SELECT activity_test.occurred_at, activity_test.actor_id,
> activity_test.actee_id, activity_test.properties;
> Total MapReduce jobs = 1
> Starting Job = job_200811250653_0022, Tracking URL = http://{clipped}:50030/jobdetails.jsp?jobid=job_200811250653_0022
> Kill Command = /Users/josh/Hadoop/bin/hadoop job -Dmapred.job.tracker={clipped}:54311 -kill job_200811250653_0022
> map = 0%, reduce =0%
> map = 50%, reduce =0%
> map = 100%, reduce =0%
> Ended Job = job_200811250653_0022
> Moving data to: /data/output2
> OK
>
> $ hadoop fs -cat /data/output2/*
> 012{}
>
> Still getting incorrect results. Is there anything else I could try?
>
> Josh Ferguson
>
> On Nov 25, 2008, at 6:34 PM, Ashish Thusoo wrote:
>
>> Can you try putting the ASCII value within quotes, so for example
>> FIELDS TERMINATED BY '124' etc...
>>
>> You can also look at the following file in the source to see an
>> example of how this is done:
>>
>> ql/src/test/queries/clientpositive/input_dynamicserde.q
>>
>> Ashish
>>
>> -----Original Message-----
>> From: Josh Ferguson [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, November 25, 2008 6:18 PM
>> To: [email protected]
>> Subject: Trouble Loading Into External Table
>>
>> OK, so I'm trying to create an external table, load a delimited file
>> into it, and then do a basic select out of it. Here is a description
>> of my scenario along with the steps I took and their results.
>> Hopefully someone can help me figure out what I'm doing wrong.
>>
>> # Sample.tab
>>
>> 1227422134|2|1|paid:44519,tax:2120,value:42399
>>
>> # CREATE TABLE
>>
>> hive> CREATE EXTERNAL TABLE activity_test
>>> (occurred_at INT, actor_id INT, actee_id INT, properties MAP<STRING, STRING>)
>>> ROW FORMAT DELIMITED
>>> FIELDS TERMINATED BY "|"
>>> COLLECTION ITEMS TERMINATED BY ","
>>> MAP KEYS TERMINATED BY ":"
>>> LOCATION '/data/sample';
>> OK
>>
>> # LOAD DATA
>>
>> hive> LOAD DATA LOCAL INPATH '/Users/josh/Hive/sample.tab' INTO TABLE activity_test;
>> Copying data from file:/Users/josh/Hive/sample.tab
>> Loading data to table activity_test
>> OK
>>
>> # INSERT OVERWRITE DIRECTORY
>>
>> hive> FROM activity_test INSERT OVERWRITE DIRECTORY '/data/output'
>> SELECT activity_test.occurred_at, activity_test.actor_id,
>> activity_test.actee_id, activity_test.properties;
>> Total MapReduce jobs = 1
>> Starting Job = job_200811250653_0018, Tracking URL = http://{clipped}:50030/jobdetails.jsp?jobid=job_200811250653_0018
>> Kill Command = /Users/josh/Hadoop/bin/hadoop job -Dmapred.job.tracker={clipped}:54311 -kill job_200811250653_0018
>> map = 0%, reduce =0%
>> map = 50%, reduce =0%
>> map = 100%, reduce =0%
>> Ended Job = job_200811250653_0018
>> Moving data to: /data/output
>> OK
>> Time taken: 72.329 seconds
>>
>> $ hadoop fs -cat /data/output/*
>> 012{}
>>
>> This obviously isn't the correct output; these are just some default
>> values for those columns. What am I doing wrong?
>>
>> Thanks
>>
>> Josh Ferguson
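As a quick sanity check outside Hive, splitting the sample line from this thread on '|' with standard awk shows the four fields the table definition expects. This is a local sketch against the sample data, not part of the original exchange's output:

$ echo '1227422134|2|1|paid:44519,tax:2120,value:42399' | awk -F'|' '{ print $1; print $2; print $3; print $4 }'
1227422134
2
1
paid:44519,tax:2120,value:42399

Since the raw line splits cleanly on '|', the 012{} default-value result points at the SerDe's handling of the explicit FIELDS TERMINATED BY delimiter rather than at the data file itself, which is consistent with Josh's workaround of leaving the field delimiter at its default.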
