Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The "Hive/Tutorial" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/Tutorial?action=diff&rev1=21&rev2=22

--------------------------------------------------

    * An additional pvs.country column is added in the select statement. This 
is the corresponding input column for the dynamic partition column. Note that 
you do not need to add an input column for the static partition column because 
its value is already known in the PARTITION clause. 
  
  Semantics of the dynamic partition insert statement:
+   * When non-empty partitions already exist for the dynamic partition 
columns (e.g., country='CA' exists under some ds root partition), an existing 
partition is overwritten if the dynamic partition insert sees the same value 
(say 'CA') in the input data. This is in line with the 'insert overwrite' 
semantics. However, if the partition value 'CA' does not appear in the input 
data, the existing partition is not overwritten. 
    * Since a Hive partition corresponds to a directory in HDFS, the partition 
value has to conform to the HDFS path format (URI in Java). Any character 
having a special meaning in a URI (e.g., '%', ':', '/', '#') will be escaped as 
'%' followed by the two hexadecimal digits of its ASCII code.  
    * If the input column is of a type other than STRING, its value is first 
converted to STRING and then used to construct the HDFS path. 
    * If the input column value is NULL or an empty string, the row is put 
into a special partition, whose name is controlled by the Hive parameter 
hive.exec.default.dynamic.partition.name. The default value is 
__HIVE_DEFAULT_PARTITION__. Basically, this partition contains all "bad" 
rows whose values are not valid partition names. The caveat of this approach is 
that the bad value is lost and is replaced by __HIVE_DEFAULT_PARTITION__ 
when you select those rows in Hive. JIRA HIVE-1309 is a solution that lets 
users specify a "bad file" to retain the input partition column values as well.
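
The naming rules above can be sketched in a few lines of Python. This is only 
an illustration of the semantics described in the list, not Hive's actual 
implementation. The function name partition_dir_name and the SPECIAL_CHARS set 
are hypothetical, and the sketch assumes that only the four characters named 
above are escaped; the set Hive really escapes may be larger.

```python
# Illustrative sketch only; not Hive's actual implementation.

# Default value of hive.exec.default.dynamic.partition.name, as stated above.
DEFAULT_PARTITION_NAME = "__HIVE_DEFAULT_PARTITION__"

# Assumption: only the four characters named in the text are escaped here.
SPECIAL_CHARS = set("%:/#")


def partition_dir_name(column, value, default_name=DEFAULT_PARTITION_NAME):
    """Return a 'column=value' directory name for one dynamic partition value."""
    # NULL (None) and the empty string both go to the default partition.
    if value is None or value == "":
        return f"{column}={default_name}"
    # Values of other types are converted to a string first.
    text = str(value)
    # Escape each special character as '%' plus its two-digit hex ASCII code.
    escaped = "".join(
        f"%{ord(ch):02X}" if ch in SPECIAL_CHARS else ch for ch in text
    )
    return f"{column}={escaped}"
```

For example, partition_dir_name("country", None) falls back to the default 
partition name, while a value such as "a/b" has its '/' replaced by "%2F" 
before the directory name is built.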
