Thanks, Ashutosh and Ed.

Historically, I didn't have much reason to choose managed over external tables or 
vice versa, since the semantics were very similar. I chose external because it 
gave me a better handle on the table metadata. For example, if a new column 
got added to the file, I could just drop the external table and recreate it with 
the new schema. With managed tables, I could do the same using ALTER TABLE 
commands, but at that time not all of a table's metadata could be modified that 
way, so I decided to go with external tables. I think a lot of people 
use external tables on HDFS in preference to managed tables.
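
To make the drop-and-recreate workflow concrete, here is a rough sketch. The 
table name, columns, delimiter, and HDFS path are all hypothetical and only 
there for illustration; the point is that dropping an external table removes 
the metastore entry and leaves the files in place.

```sql
-- Hypothetical table, columns, and path, for illustration only.
CREATE EXTERNAL TABLE page_views (user_id BIGINT, url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- A new trailing column shows up in the files. Dropping an external table
-- removes only the metastore entry; the files under LOCATION are untouched.
DROP TABLE page_views;

-- Recreate the table over the same files with the new schema.
CREATE EXTERNAL TABLE page_views (user_id BIGINT, url STRING, referrer STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/page_views';

-- The managed-table route would instead be something like:
-- ALTER TABLE page_views ADD COLUMNS (referrer STRING);
```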

I did see the property hive.insert.into.external.tables, but it's an all-or-none 
switch. If I had an HBase external table and an HDFS external table, it might 
very well be the case that I want to be able to insert into the HDFS-backed 
external table but not the HBase one. So, to me, disallowing insert into all 
external tables doesn't seem like the right thing to do. Like Ed suggested, 
it's dependent on the storage handler, not on the table being external. I could 
go ahead and use table locking in that case, but that kinda defeats the purpose 
of this feature and property.
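
A small sketch of the all-or-none concern, assuming the property behaves as 
described in this thread (inserts allowed by default, rejected for every 
external table once it is switched off). The table names are hypothetical.

```sql
-- Turn off inserts into external tables for this session.
SET hive.insert.into.external.tables=false;

-- Hypothetical tables. With the switch off, both statements would be rejected,
-- although only the HBase-backed table is the one worth protecting here.
INSERT INTO TABLE hdfs_external_events SELECT * FROM staging_events;
INSERT INTO TABLE hbase_external_events SELECT * FROM staging_events;
```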

Thoughts?

Mark

----- Original Message -----
From: "Ashutosh Chauhan" <hashut...@apache.org>
To: dev@hive.apache.org
Cc: u...@hive.apache.org
Sent: Friday, June 1, 2012 10:24:24 AM
Subject: Re: Behavior of Hive 2837: insert into external tables should not be 
allowed

Hi Mark, 


I understand your concern w.r.t. backward compatibility. But as Ed pointed out, 
there is a config variable, and by default the semantics are unchanged, so you 
can continue to insert into your external table. 
I have a question though. Why are you creating all your tables as "external" 
tables? Why not regular tables? 


Thanks, 
Ashutosh 


On Thu, May 31, 2012 at 9:35 PM, Mark Grover < grover.markgro...@gmail.com > 
wrote: 


Hi folks, 
I have a question regarding HIVE-2837 
( https://issues.apache.org/jira/browse/HIVE-2837 ), which deals with 
disallowing insert into queries on external tables. 

From looking at the JIRA, it seems like it applies to external tables on 
HDFS as well. Technically, insert into should be ok for external tables on 
HDFS (and S3 as well). It seems like it should be up to the storage or file 
system layer to specify whether insert into is supported and to implement it. 

Historically, there hasn't been any real difference between creating an 
external table on HDFS vs creating a managed one. However, if we disallow 
insert into on external tables, that would mean that folks with external 
tables on HDFS wouldn't be able to make use of insert into functionality 
even though they should be able to. Do we want to allow insert into on HDFS 
tables regardless of whether they are external or not? 

Mark 
