Re: External partition table question

2014-07-17 Thread Lefty Leverenz
Thanks for this clarification. I've revised the Add Partitions section in the wiki accordingly. -- Lefty On Fri, Jul 18, 2014 at 12:45 AM, Satish Mittal wrote: > 'ALTER TABLE .. ADD PARTITION

Re: External partition table question

2014-07-17 Thread Satish Mittal
'ALTER TABLE .. ADD PARTITION..' would just add a partition entry for the table in the Hive metastore. It doesn't perform any data loading; instead it expects the data to be loaded already in the file pointed to by LOCATION. On Tue, Jul 15, 2014 at 5:39 AM, Raymond Lau wrote: > I've created an external

External partition table question

2014-07-14 Thread Raymond Lau
I've created an external table partitioned by a field and am attempting to load in the data via the command 'ALTER TABLE partitioned_table_test ADD PARTITION (pcode = '123') LOCATION '/path/to/parquet/files';' using a custom Parquet SerDe. Does loading in the data this way call the serializer() fu

Compression for a HDFS text file - Hive External Partition Table

2013-11-13 Thread Raj Hadoop
Hi, 1) My requirement is to load a file (a tar.gz file which has multiple tab separated values files and one file is the main file which has huge data – about 10 GB per day) to an externally partitioned hive table. 2) What I am doing is I have automated the process by extracting

Re: External Partition Table

2013-10-31 Thread Raj Hadoop
Tim On Thu, Oct 31, 2013 at 4:34 PM, Raj Hadoop wrote: Hi, > > >I am planning for a Hive External Partition Table based on a date. > > >Which one of the below yields a better performance or both have the same >performance? > > >1) Partition based on one folder per d

Re: External Partition Table

2013-10-31 Thread Timothy Potter
the same performance because Hive is still selecting the same number of input paths in both scenarios, one just happens to be a little deeper. Cheers, Tim On Thu, Oct 31, 2013 at 4:34 PM, Raj Hadoop wrote: > Hi, > > I am planning for a Hive External Partition Table based on a date. >
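
A rough way to see the point about input paths is to enumerate the partition directories each layout would produce for the same date range. This is only a sketch: the root path `/data/events`, the `dt=` / `year=` / `month=` / `day=` directory names, and both helper functions are hypothetical, not taken from the thread.

```python
from datetime import date, timedelta


def daily_paths(start, end, root="/data/events"):
    """One folder per day, e.g. <root>/dt=20131031 (hypothetical naming)."""
    days = (end - start).days + 1
    return [
        f"{root}/dt={(start + timedelta(days=d)):%Y%m%d}"
        for d in range(days)
    ]


def nested_paths(start, end, root="/data/events"):
    """One folder per year / month / day (hypothetical naming)."""
    days = (end - start).days + 1
    paths = []
    for d in range(days):
        cur = start + timedelta(days=d)
        paths.append(
            f"{root}/year={cur.year}/month={cur.month:02d}/day={cur.day:02d}"
        )
    return paths


# Same date range, two layouts: the number of leaf directories is identical,
# the nested layout is simply deeper.
start, end = date(2013, 10, 1), date(2013, 10, 31)
flat = daily_paths(start, end)
nested = nested_paths(start, end)
print(len(flat), len(nested))
```

Either layout yields one leaf directory per day in the range, which is consistent with the reply above; the sketch says nothing about metastore overhead or file sizes.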

Re: External Partition Table

2013-10-31 Thread Brad Ruderman
but partitions will help > when date restricting. > > Thx, > Brad > > > On Thu, Oct 31, 2013 at 3:34 PM, Raj Hadoop wrote: > > Hi, > > I am planning for a Hive External Partition Table based on a date. > > Which one of the below yields a better performance

Re: External Partition Table

2013-10-31 Thread Raj Hadoop
> >I am planning for a Hive External Partition Table based on a date. > > >Which one of the below yields a better performance or both have the same >performance? > > >1) Partition based on one folder per day >LIKE date INT >2) Partition based on one folder per year

Re: External Partition Table

2013-10-31 Thread Brad Ruderman
aller number of files is typically preferred but partitions will help when date restricting. Thx, Brad On Thu, Oct 31, 2013 at 3:34 PM, Raj Hadoop wrote: > Hi, > > I am planning for a Hive External Partition Table based on a date. > > Which one of the below yields a better performa

External Partition Table

2013-10-31 Thread Raj Hadoop
Hi, I am planning for a Hive External Partition Table based on a date. Which one of the below yields a better performance or both have the same performance? 1) Partition based on one folder per day LIKE date INT 2) Partition based on one folder per year / month / day ( So it has three folders