Re: Creating partitions causes Error in semantic analysis

Ning Zhang Wed, 16 Jun 2010 23:49:11 -0700

The __HIVE_DEFAULT_PARTITION__ is created by default if the partitioning column 
value (newdatestamp etc.) is NULL or empty string. Below is the wiki page that 
describes the syntax and semantics of dynamic partitioning, including some best 
practices.


http://wiki.apache.org/hadoop/Hive/Tutorial#Dynamic-partition_Insert

Ning

On Jun 16, 2010, at 11:11 PM, Viraj Bhat wrote:

> Hi Yongqiang,
> I am using the trunk code. I figured out what the problem was 
> INSERT OVERWRITE TABLE newtable
> PARTITION (newdatestamp, myregion, myproperty)
> SELECT
> name,
> age,
> datestamp as newdatestamp,
> region as myregion,
> property as myproperty,
> from oldtable where datestamp='20100525';
> 
> I need to specify the last 3 columns in the order of partitions, which I did 
> not. 
> 
> Meanwhile the dynamic partitioning produced a partition which was named 
> "__HIVE_DEFAULT_PARTITION__". Is this created by default? 
> 
> Thanks again for your help.
> Viraj
> 
> -----Original Message-----
> From: yongqiang he [mailto:[email protected]] 
> Sent: Wednesday, June 16, 2010 5:46 PM
> To: [email protected]
> Subject: Re: Creating partitions causes Error in semantic analysis
> 
> Hive supports dynamic partition ( i think you need to use trunk code
> for this feature.?).
> 
> here is an example:
> 
> set hive.exec.dynamic.partition.mode=nonstrict;
> set hive.exec.dynamic.partition=true;
> 
> create table if not exists nzhang_part1 like srcpart;
> create table if not exists nzhang_part2 like srcpart;
> describe extended nzhang_part1;
> 
> from srcpart
> insert overwrite table nzhang_part1 partition (ds, hr) select key,
> value, ds, hr where ds <= '2008-04-08'
> insert overwrite table nzhang_part2 partition(ds='2008-12-31', hr)
> select key, value, hr where ds > '2008-04-08';
> 
> On Wed, Jun 16, 2010 at 4:07 PM, Viraj Bhat <[email protected]> wrote:
>> Hi all,
>> 
>>   I have a table known as "oldtable" which is partitioned by datestamp.
>> 
>> 
>> 
>> The schema of the "oldtable" is:
>> 
>> 
>> 
>> name string
>> 
>> age bigint
>> 
>> property string
>> 
>> region string
>> 
>> datestamp string
>> 
>> 
>> 
>> 
>> 
>> I now need to create a new table which is based of this old table and
>> partitioned by (datestamp, region, property)
>> 
>> 
>> 
>> The DDL for the new table looks like:
>> 
>> 
>> 
>> CREATE EXTERNAL TABLE newtable
>> 
>> (
>> 
>> newname string,
>> 
>> newage bigint,
>> 
>> )
>> 
>> 
>> 
>> PARTITIONED BY (newdatestamp STRING, myregion STRING, myproperty STRING)
>> 
>> 
>> 
>> STORED AS RCFILE
>> 
>> LOCATION '/user/viraj/rcfile;
>> 
>> 
>> 
>> 
>> 
>> When I try to populate this new table from my old table, I try to use
>> partitioning which uses values of old columns.
>> 
>> 
>> 
>> INSERT OVERWRITE TABLE newtable
>> 
>> PARTITION (newdatestamp='20100525', region, property)
>> 
>> SELECT
>> 
>>   name,
>> 
>>   age
>> 
>> from oldtable where datestamp='20100525';
>> 
>> 
>> 
>> The above statement causes an error and expects hardcoded values for region
>> and property.
>> 
>> 
>> 
>> FAILED: Error in semantic analysis: Partition column in the partition
>> specification does not exist.
>> 
>> 
>> 
>> How do I specify the partition information such that the new tables, takes
>> values from "property" and "region" from the old table and uses it as
>> partitions.
>> 
>> 
>> 
>> Is there a better way to achieve the above instead of hard coding values for
>> each and every partition?
>> 
>> 
>> 
>> =======================================================
>> 
>> Addendum: If the above is possible, how can I define some conditions where I
>> need to say, If region is not "us" or "asia", put it in another partition
>> known as misc?
>> 
>> =======================================================
>> 
>> 
>> 
>> 
>> 
>> Thanks Viraj

Re: Creating partitions causes Error in semantic analysis

Reply via email to