The following page has been changed by JoydeepSensarma:
http://wiki.apache.org/hadoop/Hive/LanguageManual/LanguageManual/DML

New page:
There are two primary ways of manipulating data in Hive:

=== Loading files into tables ===

Hive does not do any transformation while loading data into tables. Load 
operations are currently pure copy/move operations that move data files into 
locations corresponding to Hive tables.

===== Syntax =====
{{{
LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
}}}
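
As a minimal sketch of the syntax above, the statement below loads into one partition of a table. The table name `page_view` and its partitioning columns `dt` and `country` are hypothetical and chosen only for illustration; the path is the absolute-path example used elsewhere on this page.

{{{
-- Assumes a hypothetical table page_view partitioned by (dt, country).
-- Values are given for every partitioning column.
LOAD DATA INPATH '/user/hive/project/data1'
INTO TABLE page_view PARTITION (dt='2008-06-08', country='US');
}}}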

===== Synopsis =====

Load operations are currently pure copy/move operations that move data files into 
locations corresponding to Hive tables.
 * ''filepath'' can be:
  * a relative path, e.g. `project/data1`
  * an absolute path, e.g. `/user/hive/project/data1`
  * a full URI with a scheme and (optionally) an authority, e.g. `hdfs://namenode:9000/user/hive/project/data1`
 * The target can be a table or a partition. If the table is partitioned, the statement must name a single partition by giving values for all of the partitioning columns.
 * ''filepath'' can refer to a file (in which case Hive will move the file into the table) or to a directory (in which case Hive will move all the files within that directory into the table). In either case ''filepath'' addresses a set of files.
 * If the keyword LOCAL is specified, then:
  * the load command will look for ''filepath'' in the local file system. A relative path is interpreted relative to the user's current directory. The user can also specify a full URI for local files, for example: `file:///user/hive/project/data1`
  * the load command will try to copy all the files addressed by ''filepath'' to the target file system. The target file system is inferred from the location attribute of the table. The copied data files will then be moved to the table.
 * If the keyword LOCAL is ''not'' specified, then Hive will use the full URI of ''filepath'' if one is specified. Otherwise the following rules are applied:
  * If the scheme or authority is not specified, Hive will use the scheme and authority from the Hadoop configuration variable `fs.default.name`, which specifies the Namenode URI.
  * If the path is not absolute, Hive will interpret it relative to `/user/<username>`
  * Hive will ''move'' the files addressed by ''filepath'' into the table (or partition)
 * If the OVERWRITE keyword is used, the contents of the target table (or partition) will be deleted and replaced with the files referred to by ''filepath''. Otherwise the files referred to by ''filepath'' will be added to the table.
  * Note that if the target table (or partition) already has a file whose name collides with any of the filenames contained in ''filepath'', the existing file will be replaced with the new file.
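
The rules above can be sketched with three statements. The table names `page_view_raw` (assumed unpartitioned) and `page_view` (assumed partitioned by `dt` and `country`) are hypothetical; the paths are the examples already used on this page.

{{{
-- LOCAL: the relative path is resolved against the user's current directory
-- on the local file system; the files are copied and then added to the table.
LOAD DATA LOCAL INPATH 'project/data1' INTO TABLE page_view_raw;

-- No LOCAL: scheme and authority come from fs.default.name, and the relative
-- path is resolved against /user/<username>; the files are moved.
LOAD DATA INPATH 'project/data1' INTO TABLE page_view_raw;

-- Full URI with OVERWRITE: the existing contents of the named partition are
-- deleted and replaced with the files addressed by filepath.
LOAD DATA INPATH 'hdfs://namenode:9000/user/hive/project/data1'
OVERWRITE INTO TABLE page_view PARTITION (dt='2008-06-08', country='US');
}}}

Because a load without LOCAL is a move, the source files are not expected to remain at ''filepath'' after the second and third statements.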

===== Notes =====
 * ''filepath'' cannot contain subdirectories.
 * If the keyword LOCAL is not used, ''filepath'' must refer to files within the same file system as the table's (or partition's) location.
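
A short sketch of the same-file-system restriction follows. It assumes a hypothetical unpartitioned table `page_view_raw` whose location is on `hdfs://namenode:9000`; the host `othernamenode` is also made up for illustration.

{{{
-- Source is on the same file system as the table's location: allowed.
LOAD DATA INPATH 'hdfs://namenode:9000/user/hive/project/data1' INTO TABLE page_view_raw;

-- Source is on a different file system and LOCAL is not used:
-- not allowed under the note above, so it is left commented out.
-- LOAD DATA INPATH 'hdfs://othernamenode:9000/user/hive/project/data1' INTO TABLE page_view_raw;
}}}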

=== Inserting data into tables from queries ===
