Still problems. I'm trying the ALTER syntax.

On Tue, May 29, 2012 at 2:27 PM, Balaji Rao <[email protected]> wrote:
> the location should be 's3://' and not 's3n://'
>
> On Tue, May 29, 2012 at 5:19 PM, Russell Jurney <[email protected]> wrote:
> > Ok, I spoke too soon. Same error. Crapola. Still working on it.
> >
> > On Tue, May 29, 2012 at 2:19 PM, Russell Jurney <[email protected]> wrote:
> >>
> >> I get an error when I create an external table. btw - I can partition on
> >> dt or from/to address. I'm just not clear on how to partition - my efforts
> >> fail.
> >>
> >> hive> create external table from_to(from_address string, to_address string, dt string)
> >>     > row format delimited fields terminated by '\t' stored as textfile location 's3n://rjurney_public_web/from_to_date';
> >> FAILED: Error in metadata: java.lang.IllegalArgumentException: Invalid hostname in URI s3n://rjurney_public_web/from_to_date
> >> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> >>
> >> However, I just upgraded to HIVE 0.9, and it works :) No reason to use
> >> the old stuff when I can scp the new one up.
> >>
> >> Thanks!
> >>
> >> On Tue, May 29, 2012 at 1:34 PM, Balaji Rao <[email protected]> wrote:
> >>>
> >>> If you are using hive on EMR, you can create a table directly from the
> >>> data on S3:
> >>>
> >>> From hive, you can create tables that use S3 data like this:
> >>>
> >>> create external table from_to(from_address string, to_address string,
> >>> dt string) row format delimited fields terminated by '\t' stored as
> >>> textfile location 's3://rjurney_public_web/from_to_date';
> >>>
> >>> You could then:
> >>> select <*> from from_to
> >>>
> >>> Balaji
> >>>
> >>> On Tue, May 29, 2012 at 4:20 PM, Russell Jurney <[email protected]> wrote:
> >>> > How do I load data from S3 into Hive using Amazon EMR? I've booted a
> >>> > small cluster, and I want to load a 3-column TSV file from Pig into a
> >>> > table like this:
> >>> >
> >>> > create table from_to (from_address string, to_address string, dt string);
> >>> >
> >>> > When I run something like this:
> >>> >
> >>> > load data inpath 's3n://rjurney_public_web/from_to_date' into table from_to;
> >>> >
> >>> > I get errors:
> >>> >
> >>> > FAILED: Error in semantic analysis: Line 1:17 Invalid path
> >>> > 's3n://rjurney_public_web/from_to_date': only "file" or "hdfs" file
> >>> > systems accepted. s3n file system is not supported.
> >>> >
> >>> > There is no distcp on the master node of my EMR cluster, so I can't
> >>> > copy it over. I've read the documentation... and so far after a day of
> >>> > trying, I can't load data into HIVE via EMR.
> >>> >
> >>> > What am I missing? Thanks!
> >>> > --
> >>> > Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
> >>
> >> --
> >> Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
> >
> > --
> > Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com

--
Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
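
For reference, here is a minimal sketch of what the ALTER approach to partitioning on dt might look like. It assumes the files for each date sit under their own S3 prefix; the dt=2012-05-29 prefix and the date value are hypothetical and do not come from the thread. The 's3://' scheme follows Balaji's suggestion for EMR. In Hive, a partition column is declared in PARTITIONED BY rather than in the column list, so dt moves out of the column definitions.

```sql
-- Sketch only: the dt=2012-05-29 prefix below is a hypothetical layout,
-- not something confirmed in the thread.
-- dt is declared as a partition column, not in the column list.
create external table from_to(from_address string, to_address string)
partitioned by (dt string)
row format delimited fields terminated by '\t'
stored as textfile
location 's3://rjurney_public_web/from_to_date';

-- Register one partition per date, pointing at that date's prefix.
alter table from_to add partition (dt='2012-05-29')
location 's3://rjurney_public_web/from_to_date/dt=2012-05-29';

-- Queries that filter on dt then read only the matching partition.
select * from from_to where dt = '2012-05-29';
```

With this layout the dt value comes from the partition definition, so a third column in the TSV files is ignored by the two-column table definition.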
