Hive introduced indexing starting in 0.7; see 
https://issues.apache.org/jira/browse/HIVE-417. So indexes are available in 
Hive.

Thanks,
Ranjith

On May 12, 2012, at 11:13 PM, Raja Thiruvathuru <thiruvath...@gmail.com> wrote:

> No indexing in hive.
> 
> 
> On Sunday, May 13, 2012, Ranjith wrote:
> Indexes can be built on tables managed by hive. For external tables I do not 
> believe that to be true. Please feel free to correct me if I am wrong.
> 
> Thanks,
> Ranjith
> 
> On May 12, 2012, at 9:24 PM, Nanda Vijaydev <nanda.vijay...@gmail.com> wrote:
> 
>> In hive, the raw data is in HDFS and there is a metadata layer that defines 
>> the structure of the raw data. A table is usually a reference to metadata, 
>> probably in a MySQL server, and it contains a reference to the location of 
>> the data in HDFS, the type of delimiter or SerDe to use, and so on.
>> 1. With hive managed tables, when you drop a table, both the metadata in 
>> MySQL and the raw data on the cluster get deleted.
>> 2. With external tables, when you drop a table, just the metadata gets 
>> deleted and the raw data continues to exist on the cluster. 
>> 
>>  
>> On Thu, May 10, 2012 at 3:02 PM, David Kulp <dk...@fiksu.com> wrote:
>> It's simpler than this.  All files look the same -- and are often very 
>> simple delimited text -- whether managed or external.  The only difference 
>> is that the files associated with a managed table are dropped when the table 
>> is dropped and files that are loaded into a managed table are moved into 
>> hive's private path.  External tables never move or remove files.  
>> Performance is the same.
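
To make the managed/external distinction above concrete, here is a minimal Python sketch. It is a toy in-memory model, not Hive's API: ToyWarehouse, Table, the /warehouse path and the file names are all hypothetical, and a plain set of paths stands in for HDFS. It only mirrors the two behaviors described above: loading into a managed table moves the file under hive's own path, and dropping a managed table removes its files, while an external table never moves or removes files.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    name: str
    location: str   # directory that holds the table's files
    external: bool


@dataclass
class ToyWarehouse:
    """Toy stand-in for a metastore plus a file system (not Hive's API)."""
    warehouse_dir: str = "/warehouse"           # hypothetical managed path
    files: set = field(default_factory=set)     # pretend HDFS: a set of paths
    tables: dict = field(default_factory=dict)  # pretend metastore

    def create_table(self, name, external=False, location=None):
        # Managed tables live under the warehouse path; external tables
        # simply point at a location that already exists.
        loc = location if external else f"{self.warehouse_dir}/{name}"
        self.tables[name] = Table(name, loc, external)

    def load(self, name, path):
        # Loading into a managed table moves the file into the table's path.
        table = self.tables[name]
        if table.external:
            raise ValueError("this toy model never moves files for external tables")
        self.files.remove(path)
        self.files.add(f"{table.location}/{path.rsplit('/', 1)[-1]}")

    def scan(self, name):
        # Reading works the same way for both kinds: list files at the location.
        prefix = self.tables[name].location + "/"
        return sorted(p for p in self.files if p.startswith(prefix))

    def drop_table(self, name):
        table = self.tables.pop(name)  # the metadata is always removed
        if not table.external:
            # Only managed tables take their files with them.
            prefix = table.location + "/"
            self.files = {p for p in self.files if not p.startswith(prefix)}


wh = ToyWarehouse(files={"/staging/a.txt", "/data/logs/b.txt"})
wh.create_table("managed_t")
wh.create_table("external_t", external=True, location="/data/logs")
wh.load("managed_t", "/staging/a.txt")

managed_files = wh.scan("managed_t")
external_files = wh.scan("external_t")

wh.drop_table("managed_t")
wh.drop_table("external_t")
remaining = sorted(wh.files)
```

In this model both tables are scanned through the same code path, which matches the point that read performance is the same. After both drops, the only file left is /data/logs/b.txt, the one that belonged to the external table.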
>> 
>> On May 10, 2012, at 5:52 PM, kulkarni.swar...@gmail.com wrote:
>> 
>> > I am pretty new to hive and was trying to clearly understand the 
>> > difference between a managed and an external table.
>> >
>> > As my current understanding stands, a managed table is a table whose data 
>> > is completely owned by hive whereas an external table is usually created 
>> > to have a hive frontend for the data managed in external systems. I would 
>> > suppose this would mean that a query on an external table goes out to 
>> > fetch data from the given external table, deserialize according to the 
>> > given/suitable SerDe and then show the output of the query in hive format.
>> >
>> > So does this mean that the cost of using external tables is much higher 
>> > than the native ones? Or is there some caching that comes into play that 
>> > I am not seeing right now?
>> >
>> > Thanks for the help.
>> >
>> > --
>> > Swarnim
>> 
>> 
> 
> 
> -- 
> 
> Raja Thiruvathuru
