Please see http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL for
how to use an "EXTERNAL" table.
You don't need to "load" data into an external table, because an
external table can point directly at your data directory.
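
For example, something like this should work (the LOCATION below just
reuses the directory from your LOAD statement, so adjust it if your
layout is different):

CREATE EXTERNAL TABLE collect_info (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/group/taobao/taobao/dw/stb/20100125/collect_info';

After that, a query such as SELECT count(1) FROM collect_info should
read the files in place, without moving them.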

Zheng

On Wed, Jan 27, 2010 at 11:38 PM, Fu Ecy <[email protected]> wrote:
> hive> CREATE EXTERNAL TABLE collect_info (
>     >
>     >  id string,
>     >  t1 string,
>     >  t2 string,
>     >  t3 string,
>     >  t4 string,
>     >  t5 string,
>     >  collector string)
>     > ROW FORMAT DELIMITED
>     > FIELDS TERMINATED BY '\t'
>     > STORED AS TEXTFILE;
> OK
> Time taken: 0.234 seconds
>
> hive> load data inpath
> '/group/taobao/taobao/dw/stb/20100125/collect_info/coll_9.collect_info575'
> overwrite into table collect_info;
> Loading data to table collect_info
> Failed with exception replaceFiles: error while moving files!!!
> FAILED: Execution Error, return code 1 from
> org.apache.hadoop.hive.ql.exec.MoveTask
>
> It doesn't work.
>
> 2010/1/28 Fu Ecy <[email protected]>
>>
>> I think this is the problem: I don't have write permissions on the
>> source files/directories. Thank you, Shao :-)
>>
>> 2010/1/28 Zheng Shao <[email protected]>
>>>
>>> When Hive loads data from HDFS, it moves the files instead of copying
>>> them.
>>>
>>> That means the current user needs write permissions on the
>>> source files/directories as well.
>>> Can you check that?
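>>>
>>> For example, you could check ownership and permissions with something
>>> like the following (the path is the one from your LOAD statement), and
>>> grant yourself write access if you own the files:
>>>
>>>   hadoop fs -ls /group/taobao/taobao/dw/stb/20100125/collect_info
>>>   hadoop fs -chmod -R u+w /group/taobao/taobao/dw/stb/20100125/collect_info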
>>>
>>> Zheng
>>>
>>> On Wed, Jan 27, 2010 at 11:18 PM, Fu Ecy <[email protected]> wrote:
>>> > <property>
>>> >   <name>hive.metastore.warehouse.dir</name>
>>> >   <value>/group/tbdev/kunlun/henshao/hive/</value>
>>> >   <description>location of default database for the
>>> > warehouse</description>
>>> > </property>
>>> >
>>> > <property>
>>> >   <name>hive.exec.scratchdir</name>
>>> >   <value>/group/tbdev/kunlun/henshao/hive/temp</value>
>>> >   <description>Scratch space for Hive jobs</description>
>>> > </property>
>>> >
>>> > [kun...@gate2 ~]$ hive --config config/ -u root -p root
>>> > Hive history
>>> > file=/tmp/kunlun/hive_job_log_kunlun_201001281514_422659187.txt
>>> > hive> create table pokes (foo int, bar string);
>>> > OK
>>> > Time taken: 0.825 seconds
>>> >
>>> > Yes, I have permissions for Hive's warehouse directory and tmp
>>> > directory.
>>> >
>>> > 2010/1/28 김영우 <[email protected]>
>>> >>
>>> >> Hi Fu,
>>> >>
>>> >> Your query looks correct, but I think this is a problem with HDFS
>>> >> permissions.
>>> >> Did you set the right permissions on Hive's warehouse directory and tmp
>>> >> directory?
>>> >> It seems user 'kunlun' does not have WRITE permission on the Hive
>>> >> warehouse directory.
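>>> >>
>>> >> You could verify this with something like the following (substitute
>>> >> your actual warehouse and scratch directories):
>>> >>
>>> >>   hadoop fs -ls <hive.metastore.warehouse.dir>
>>> >>   hadoop fs -ls <hive.exec.scratchdir>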
>>> >>
>>> >> Youngwoo
>>> >>
>>> >> 2010/1/28 Fu Ecy <[email protected]>
>>> >>>
>>> >>> 2010-01-27 12:58:22,182 ERROR ql.Driver (SessionState.java:printError(303)) - FAILED: Parse Error: line 2:10 cannot recognize input ',' in column type
>>> >>>
>>> >>> org.apache.hadoop.hive.ql.parse.ParseException: line 2:10 cannot recognize input ',' in column type
>>> >>>
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:357)
>>> >>>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:249)
>>> >>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:290)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
>>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> >>> Method)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>> >>>
>>> >>> 2010-01-27 12:58:40,394 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(570)) - Got exception: org.apache.hadoop.security.AccessControlException
>>> >>> org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
>>> >>> 2010-01-27 12:58:40,395 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(571)) - org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
>>> >>>         at
>>> >>> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>> >>> Method)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>> >>>         at
>>> >>> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
>>> >>>         at
>>> >>> org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:831)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:257)
>>> >>>         at
>>> >>> org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1118)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:123)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:505)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:256)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:254)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:883)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:105)
>>> >>>         at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:388)
>>> >>>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:294)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
>>> >>>         at
>>> >>> org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
>>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> >>> Method)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>         at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>>> >>>         at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> >>>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>> >>>         at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
>>> >>> Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4400)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4370)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:1771)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:1740)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:471)
>>> >>>         at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown
>>> >>> Source)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
>>> >>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)
>>> >>>
>>> >>>         at org.apache.hadoop.ipc.Client.call(Client.java:697)
>>> >>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>>> >>>         at $Proxy4.mkdirs(Unknown Source)
>>> >>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>> >>> Method)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> >>>         at
>>> >>>
>>> >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> >>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>>> >>>         at
>>> >>>
>>> >>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>>> >>>         at $Proxy4.mkdirs(Unknown Source)
>>> >>>         at
>>> >>> org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:829)
>>> >>>         ... 22 more
>>> >>>
>>> >>> Is there any problem with the input data format?
>>> >>>
>>> >>> CREATE TABLE collect_info (
>>> >>>   id string,
>>> >>>   t1 string,
>>> >>>   t2 string,
>>> >>>   t3 string,
>>> >>>   t4 string,
>>> >>>   t5 string,
>>> >>>   collector string)
>>> >>> ROW FORMAT DELIMITED
>>> >>> FIELDS TERMINATED BY '\t'
>>> >>> STORED AS TEXTFILE;
>>> >>>
>>> >>> 5290086045      330952255       1       2010-01-26 02:41:27     0       196050201       2010-01-26 02:41:27     2010-01-26 02:41:27     qijansher93771          0       1048
>>> >>>
>>> >>> Fields are separated by '\t'; I want to extract the fields marked in red.
>>> >>>
>>> >>> 2010/1/28 Eric Sammer <[email protected]>
>>> >>>>
>>> >>>> On 1/27/10 10:59 PM, Fu Ecy wrote:
>>> >>>> > I want to load some files on HDFS into a Hive table, but there is
>>> >>>> > an exception as follows:
>>> >>>> > hive> load data inpath
>>> >>>> > '/group/taobao/taobao/dw/stb/20100125/collect_info/*' into table
>>> >>>> > collect_info;
>>> >>>> > Loading data to table collect_info
>>> >>>> > Failed with exception addFiles: error while moving files!!!
>>> >>>> > FAILED: Execution Error, return code 1 from
>>> >>>> > org.apache.hadoop.hive.ql.exec.MoveTask
>>> >>>> >
>>> >>>> > But when I download the files from HDFS to the local machine and
>>> >>>> > then load them into the table, it works.
>>> >>>> > Data in '/group/taobao/taobao/dw/stb/20100125/collect_info/*' is a
>>> >>>> > little more than 200GB.
>>> >>>> >
>>> >>>> > I need to use Hive to compute some statistics.
>>> >>>> > Many thanks :-)
>>> >>>>
>>> >>>> The size of the files shouldn't really matter (move operations
>>> >>>> affect
>>> >>>> metadata only - the blocks aren't rewritten or anything like that).
>>> >>>> Check in your Hive log files (by default in /tmp/<user>/hive.log on
>>> >>>> the
>>> >>>> local machine you run Hive on, I believe) and you should see a stack
>>> >>>> trace with additional information.
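>>> >>>>
>>> >>>> For example, something along these lines should show the most recent
>>> >>>> stack trace (adjust the path if your Hive logging configuration
>>> >>>> differs):
>>> >>>>
>>> >>>>   tail -n 200 /tmp/$USER/hive.log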
>>> >>>>
>>> >>>> Regards.
>>> >>>> --
>>> >>>> Eric Sammer
>>> >>>> [email protected]
>>> >>>> http://esammer.blogspot.com
>>> >>>
>>> >>
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Yours,
>>> Zheng
>>
>
>



-- 
Yours,
Zheng
