Hi Ryan,

I was experimenting with this recently, and this is my experience with
"external" tables.  I would expect regular (managed) tables to work similarly.

In Hive a partition is actually a folder in HDFS, so if you put another file
in the partition folder, formatted according to the original table
definition, you are in effect "appending" to the partition.

For example, if your table exists as:
/user/hive/warehouse/mytable/

And you have a partition folder:
/user/hive/warehouse/mytable/2010-03-16/

With data files inside it:
/user/hive/warehouse/mytable/2010-03-16/data1
/user/hive/warehouse/mytable/2010-03-16/data2

You can just put more files in the partition folder in HDFS (data3, data4,
etc.) and they will be recognized as part of the partition.
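
As a rough sketch of that last step (not something I have run exactly as
written): the helper name unique_dest and the file local_batch.log are made up
for illustration, and the script only prints the "hadoop fs -put" command
instead of executing it. The point is just to pick a file name that does not
clash with data1, data2, etc. already in the folder.

```shell
# Partition folder from the example above.
partition_dir="/user/hive/warehouse/mytable/2010-03-16"

# Hypothetical helper: derive a destination name that is unlikely to collide
# with files already in the partition (UTC timestamp plus this shell's PID).
unique_dest() {
    printf '%s/data-%s-%s' "$1" "$(date -u +%Y%m%dT%H%M%S)" "$$"
}

dest="$(unique_dest "$partition_dir")"

# Build and print the copy command rather than running it. On a machine with
# a Hadoop client installed, run the printed command to do the actual copy.
# local_batch.log is a placeholder for the new local data file.
cmd="hadoop fs -put local_batch.log $dest"
echo "$cmd"
```

The new file has to be in the same format as the original table definition,
otherwise queries over the partition will not read it correctly.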

- Yen



On Wed, Mar 17, 2010 at 1:05 PM, Ryan LeCompte <[email protected]> wrote:

> Actually, I wasn't clear earlier... we are currently using this syntax for
> loading data into the table/partition:
>
> INSERT OVERWRITE TABLE ourtable PARTITION(dt='2010-03-16') ...
>
> If I execute this multiple times, I believe the data will simply be
> overwritten instead of appended, right?
>
> On Wed, Mar 17, 2010 at 4:01 PM, Ryan LeCompte <[email protected]> wrote:
>
>> Awesome! I didn't know this. :) I'll give it a shot, thanks!
>>
>>
>>
>> On Wed, Mar 17, 2010 at 3:57 PM, Edward Capriolo
>> <[email protected]> wrote:
>>
>>>
>>>
>>> On Wed, Mar 17, 2010 at 3:30 PM, Ryan LeCompte <[email protected]> wrote:
>>>
>>>> Hello all,
>>>>
>>>> Is it possible in Hive 0.5 to run multiple inserts into the same Hive
>>>> table/partition? Or is this not supported due to the fact that Hadoop
>>>> doesn't support appends properly?
>>>>
>>>> For example, it would be nice to periodically add new data every 5
>>>> minutes to a table that has a partition column for "date" via multiple
>>>> periodic INSERT statements.
>>>>
>>>> Thanks!
>>>>
>>>> Ryan
>>>>
>>> Ryan,
>>>
>>> Every file inside the partition makes up the partition. So with 'LOAD DATA
>>> INPATH (X)', if X is a unique name it will be "appended".
>>>
>>> This works for us since our 5-minute log files all have unique names.
>>>
>>> Edward
>>>
>>
>>
>
