Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by JoydeepSensarma:
http://wiki.apache.org/hadoop/Hive/LanguageManual/Cli

------------------------------------------------------------------------------
  hive> dfs -ls;
  }}}
  === Hive Resources ===
- You can add a file to list of resources using 'add FILE <file>'. This could be a local file or nfs file.
- Once files is added to the list of resources, hive query could access this file from any where in the cluster. Otherwise location of
- the file must be accessible to all machines in cluster.
+
+ Hive can manage the addition of resources to a session so that those resources are available at query execution time. Currently the only supported resource is the FILE type. Any locally accessible file can be added to the session. Once a file is added to a session, a Hive query can refer to the file by its name (in map/reduce/transform clauses), and the file is available locally at execution time across the entire Hadoop cluster. Hive uses Hadoop's Distributed Cache to distribute the added files to all the machines in the cluster at query execution time.
+
+ Usage:
+ {{{
+ ADD FILE[S] <filepath1> [<filepath2>]*
+ LIST FILE[S] [<filepath1> <filepath2> ..]
+ DELETE FILE[S] [<filepath1> <filepath2> ..]
+ }}}
+ Example:
  {{{
- hive> add FILE /tmp/tt.py
+ hive> add FILE /tmp/tt.py;
+ hive> list FILES;
+ /tmp/tt.py
  hive> from networks a MAP a.networkid USING 'python tt.py' as nn where a.ds = '2009-01-04' limit 10;
  }}}
-
+ It is not necessary to add files to the session if the files used in a transform script are available on all machines in the Hadoop cluster under the same path name. For example:
+  * ... MAP a.networkid USING 'wc -l' ...: here wc is an executable available on all machines
+  * ... MAP a.networkid USING '/home/nfsserv1/hadoopscripts/tt.py' ...: here tt.py may be accessible via an NFS mount point that is configured identically on all the cluster nodes.
-
-
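
For illustration, a transform script such as the tt.py named in the example could look like the sketch below. The page does not show the contents of tt.py, so everything here is an assumption: the function names `transform` and `run` are hypothetical, and upper-casing the network id is only a placeholder for whatever the real script computes. The sketch relies on the streaming convention that each input row reaches the script on standard input as one tab-separated line, and that each line written to standard output becomes one output row.

```python
"""Hypothetical stand-in for tt.py; the real script is not shown on the page."""
import sys


def transform(lines):
    """Yield one output line per tab-separated input row."""
    for line in lines:
        # Each row arrives as one line, with columns separated by tabs.
        fields = line.rstrip("\n").split("\t")
        networkid = fields[0]
        # Placeholder logic (an assumption): normalise the id to upper case.
        yield networkid.strip().upper()


def run(stdin, stdout):
    """Stream rows from stdin to stdout, one output line per input row."""
    for out in transform(stdin):
        stdout.write(out + "\n")
```

To use a script like this with `USING 'python tt.py'`, the file would end with a call such as `run(sys.stdin, sys.stdout)`. It would then be added to the session with `add FILE` as in the example, or placed at the same path on every node.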
