They will work as long as you put the files in the expected location for 
regular tables.


On Mar 17, 2010, at 1:57 PM, Ryan LeCompte wrote:

This is interesting... thanks for the response.

My tables are not defined as "external" tables, however. I wonder if this would 
still work?

Thanks,
Ryan


On Wed, Mar 17, 2010 at 4:46 PM, Yen Pai wrote:
Hi Ryan,

I was just experimenting with this recently and this is my experience with 
"external" tables.  I would imagine regular tables work similarly.

In Hive a partition is actually a folder in HDFS, so if you put another file in 
the partition folder, formatted according to the original table definition, you 
are in effect "appending" to the partition.

For example, if your table exists as:
/user/hive/warehouse/mytable/

And you have a partition folder:
/user/hive/warehouse/mytable/2010-03-16/

With data files inside it:
/user/hive/warehouse/mytable/2010-03-16/data1
/user/hive/warehouse/mytable/2010-03-16/data2

You can just put more files in the partition folder in HDFS (data3, data4, 
etc.) and they will be recognized as part of the partition.
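The idea can be modelled on a local filesystem. The following is only a sketch, not Hive itself: it assumes a temporary local directory standing in for the HDFS partition folder, and the row contents and the helper names read_partition and demo are made up for illustration.

```python
import tempfile
from pathlib import Path


def read_partition(partition_dir: Path) -> list:
    """Return every row from every file in the folder, in file-name order."""
    rows = []
    for data_file in sorted(partition_dir.iterdir()):
        if data_file.is_file():
            rows.extend(data_file.read_text().splitlines())
    return rows


def demo() -> list:
    with tempfile.TemporaryDirectory() as root:
        # Local stand-in for /user/hive/warehouse/mytable/2010-03-16/
        partition = Path(root) / "mytable" / "2010-03-16"
        partition.mkdir(parents=True)
        (partition / "data1").write_text("row-a\n")
        (partition / "data2").write_text("row-b\n")
        # Dropping in a new, uniquely named file "appends" to the partition.
        (partition / "data3").write_text("row-c\n")
        return read_partition(partition)


if __name__ == "__main__":
    print(demo())
```

Reading the folder returns the rows of all three files, which is the sense in which data3 has been appended.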

- Yen




On Wed, Mar 17, 2010 at 1:05 PM, Ryan LeCompte wrote:
Actually, I wasn't clear earlier... we are currently using this syntax for 
loading data into the table/partition:

INSERT OVERWRITE TABLE ourtable PARTITION(dt='2010-03-16') ...

If I execute this multiple times, I believe the data will simply be overwritten 
instead of appended, right?






On Wed, Mar 17, 2010 at 4:01 PM, Ryan LeCompte wrote:
Awesome! I didn't know this. :) I'll give it a shot, thanks!



On Wed, Mar 17, 2010 at 3:57 PM, Edward Capriolo wrote:


On Wed, Mar 17, 2010 at 3:30 PM, Ryan LeCompte wrote:
Hello all,

Is it possible in Hive 0.5 to run multiple inserts into the same Hive 
table/partition? Or is this not supported due to the fact that Hadoop doesn't 
support appends properly?

For example, it would be nice to periodically add new data every 5 minutes to a 
table that has a partition column for "date" via multiple periodic INSERT 
statements.

Thanks!

Ryan

Ryan,

All the files inside the partition folder together make up the partition. So 
with 'LOAD DATA INPATH (X)', if X is a unique name it will be "appended".

This works for us since our 5 minute log files all have unique names.
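A periodic loader along these lines might look like the sketch below. It is an assumption about how such a job could be written, not our actual script: the "weblog" prefix, the /tmp path, and the helper names are hypothetical, while the table name ourtable and the dt partition column are taken from earlier in this thread. The statement uses Hive's LOAD DATA LOCAL INPATH form without OVERWRITE, so files already in the partition are left in place.

```python
from datetime import datetime


def batch_file_name(prefix: str, ts: datetime) -> str:
    """Name a log file after the start of the 5-minute window containing ts."""
    window_start = ts.replace(
        minute=ts.minute - ts.minute % 5, second=0, microsecond=0
    )
    return f"{prefix}-{window_start:%Y%m%d-%H%M}.log"


def load_statement(local_path: str, table: str, dt: str) -> str:
    """Build a LOAD DATA statement without OVERWRITE, so existing files stay."""
    return (
        f"LOAD DATA LOCAL INPATH '{local_path}' "
        f"INTO TABLE {table} PARTITION (dt='{dt}')"
    )


if __name__ == "__main__":
    # Hypothetical batch: a log file covering the window that contains 13:57:42.
    name = batch_file_name("weblog", datetime(2010, 3, 16, 13, 57, 42))
    print(load_statement(f"/tmp/{name}", "ourtable", "2010-03-16"))
```

Because each 5-minute window yields a different file name, successive loads add new files to the partition rather than colliding with earlier ones.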

Edward




