location of the directory you wish to add, your
> data will be added correctly.
>
> pat
>
> *From:* Kennon Lee [mailto:ken...@brooklynpacket.com]
> *Sent:* Wednesday, June 29, 2011 1:27 PM
>
> *To:* user@hive.apache.org
> *Subject:* Re: loading
tion load.
tl;dr?
If you specify the full location of the directory you wish to add, your data
will be added correctly.
pat
From: Kennon Lee [mailto:ken...@brooklynpacket.com]
Sent: Wednesday, June 29, 2011 1:27 PM
To: user@hive.apache.org
Subject: Re: loading datafiles in s3
Thanks for the responses. Regarding the first question, I wasn't sure what
you meant by using ALTER TABLE statements to allow for non-prefixed
directory names. Don't you still have to name the directories with the
'blah=' part? For instance, if we do:
ALTER TABLE foo ADD PARTITION (dt='2011-06-29')
I think the answer to 1 is no, but you can confirm on the AWS EMR forum.
The problem I've been having is that if you have x=foo in the prefix of your
S3 path, EMR will try to use it as part of your partitioning key even if you
don't want it.
Say the path is x=foo/y=bar/data and you want to partition on y only.
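One way to sidestep the path-derived keys, following the advice elsewhere in this thread to give the full location of each directory, is to register every partition explicitly. A minimal Python sketch that generates such statements is below. The table name foo comes from the earlier example; the bucket name my-bucket, the partition value baz, and the helper add_partition_statements are made up for illustration, and whether EMR then ignores the x=foo prefix is worth confirming on your own cluster.

```python
def add_partition_statements(table, partition_col, locations):
    """Build one ALTER TABLE ... ADD PARTITION statement per (value, path) pair."""
    statements = []
    for value, path in locations:
        statements.append(
            f"ALTER TABLE {table} ADD PARTITION ({partition_col}='{value}') "
            f"LOCATION '{path}';"
        )
    return statements


# Hypothetical layout: partition on y only, although the path also contains x=foo.
locations = [
    ("bar", "s3://my-bucket/x=foo/y=bar/"),
    ("baz", "s3://my-bucket/x=foo/y=baz/"),
]

for stmt in add_partition_statements("foo", "y", locations):
    print(stmt)
```

The printed statements can be pasted into a Hive script; each one names only y as the partition key and points at the full directory.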
allo,
1. Dunno. I generate my EMR scripts in a separate script, so generating a stack
of 'alter table...' queries is easy for me.
2. event_b will have a null value in column 4.
2b. (You didn't ask.) What happens with this row:
event_c user_id france 500 afifthcolumn
afifthcolumn will be truncated.
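The two cases above (a short row padded with null, a long row losing its extra field) can be mimicked with a small Python sketch. This is only an illustration of the behaviour described in this message, not Hive's actual code; it assumes a four-column table and tab-delimited rows, and the three-field event_b row is a guess at what the original data looked like.

```python
def map_row(line, num_columns=4, delimiter="\t"):
    """Map one delimited text line onto a fixed number of table columns."""
    fields = line.rstrip("\n").split(delimiter)
    # Extra trailing fields are dropped.
    fields = fields[:num_columns]
    # Missing trailing fields become None (shown as NULL in Hive).
    fields += [None] * (num_columns - len(fields))
    return fields


# Hypothetical short row: only three fields, so column 4 is None.
print(map_row("event_b\tuser_id\tfrance"))

# Row from the message: the fifth field does not fit and is discarded.
print(map_row("event_c\tuser_id\tfrance\t500\tafifthcolumn"))
```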