Please follow Mafish's suggestion of creating an external table.

Zheng
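An external table leaves the files where they already sit, so Hive never tries to move them and no write access to the source directory is needed. A minimal sketch of what that could look like, reusing the column list from the collect_info DDL quoted further down (the table name collect_info_ext is only a placeholder, and the LOCATION shown is the source directory from the failing LOAD DATA):

-- illustrative only: adjust the table name and LOCATION to your setup
CREATE EXTERNAL TABLE collect_info_ext (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/group/taobao/taobao/dw/stb/20100125/collect_info';

Dropping an external table later removes only the metadata; the files under LOCATION are left in place.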
On Wed, Jan 27, 2010 at 11:32 PM, Fu Ecy <[email protected]> wrote:

I think this is the problem, I don't have the write permissions to the source files/directories. Thank you, Shao :-)

2010/1/28 Zheng Shao <[email protected]> wrote:

When Hive loads data from HDFS, it moves the files instead of copying the files. That means the current user should have write permissions to the source files/directories as well. Can you check that?

Zheng
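If LOAD DATA is still the preferred route, a quick way to see who owns the source directory, and (only if you own it, or can ask the 'hadoop' user or an HDFS admin to run it) to grant the missing write access, might look like this:

# show owner, group and mode of the source directory and its files
hadoop fs -ls /group/taobao/taobao/dw/stb/20100125
hadoop fs -ls /group/taobao/taobao/dw/stb/20100125/collect_info

# 775 is only an example mode; it helps only if user 'kunlun' is in the
# directory's group, and it must be run by the owner or an HDFS superuser
hadoop fs -chmod -R 775 /group/taobao/taobao/dw/stb/20100125/collect_info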
On Wed, Jan 27, 2010 at 11:18 PM, Fu Ecy <[email protected]> wrote:

<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/group/tbdev/kunlun/henshao/hive/</value>
  <description>location of default database for the warehouse</description>
</property>

<property>
  <name>hive.exec.scratchdir</name>
  <value>/group/tbdev/kunlun/henshao/hive/temp</value>
  <description>Scratch space for Hive jobs</description>
</property>

[kun...@gate2 ~]$ hive --config config/ -u root -p root
Hive history file=/tmp/kunlun/hive_job_log_kunlun_201001281514_422659187.txt
hive> create table pokes (foo int, bar string);
OK
Time taken: 0.825 seconds

Yes, I have the permission for Hive's warehouse directory and tmp directory.

2010/1/28 김영우 <[email protected]> wrote:

Hi Fu,

Your query seems correct, but I think it is a problem related to HDFS permissions. Did you set the right permissions for Hive's warehouse directory and tmp directory? It seems user 'kunlun' does not have WRITE permission for the Hive warehouse directory.

Youngwoo
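A quick way to verify from the shell what Youngwoo asks about (paths taken from the hive-site.xml excerpt above; listing the parent directory shows the owner and mode of the warehouse and scratch entries themselves):

hadoop fs -ls /group/tbdev/kunlun/henshao
hadoop fs -ls /group/tbdev/kunlun/henshao/hive
# both 'hive' and 'hive/temp' should be writable by user 'kunlun'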
2010/1/28 Fu Ecy <[email protected]> wrote:

2010-01-27 12:58:22,182 ERROR ql.Driver (SessionState.java:printError(303)) - FAILED: Parse Error: line 2:10 cannot recognize input ',' in column type

org.apache.hadoop.hive.ql.parse.ParseException: line 2:10 cannot recognize input ',' in column type
        at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:357)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:249)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:290)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)

2010-01-27 12:58:40,394 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(570)) - Got exception: org.apache.hadoop.security.AccessControlException org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
2010-01-27 12:58:40,395 ERROR hive.log (MetaStoreUtils.java:logAndThrowMetaException(571)) - org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:96)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:58)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:831)
        at org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:257)
        at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1118)
        at org.apache.hadoop.hive.metastore.Warehouse.mkdirs(Warehouse.java:123)
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:505)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:256)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:254)
        at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:883)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:105)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:388)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:294)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:163)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:221)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:335)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.security.AccessControlException: Permission denied: user=kunlun, access=WRITE, inode="user":hadoop:cug-admin:rwxr-xr-x
        at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:176)
        at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.check(PermissionChecker.java:157)
        at org.apache.hadoop.hdfs.server.namenode.PermissionChecker.checkPermission(PermissionChecker.java:105)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4400)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4370)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:1771)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:1740)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.mkdirs(NameNode.java:471)
        at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:481)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:894)

        at org.apache.hadoop.ipc.Client.call(Client.java:697)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.mkdirs(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.mkdirs(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:829)
        ... 22 more

Is there any problem with the input data format?

CREATE TABLE collect_info (
  id string,
  t1 string,
  t2 string,
  t3 string,
  t4 string,
  t5 string,
  collector string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

5290086045 330952255 1 2010-01-26 02:41:27 0 196050201 2010-01-26 02:41:27 2010-01-26 02:41:27 qijansher93771 0 1048

Fields are separated by '\t'; I want to get the fields marked in red.

2010/1/28 Eric Sammer <[email protected]> wrote:

On 1/27/10 10:59 PM, Fu Ecy wrote:
> I want to load some files on HDFS into a Hive table, but there is an
> exception, as follows:
>
> hive> load data inpath '/group/taobao/taobao/dw/stb/20100125/collect_info/*' into table collect_info;
> Loading data to table collect_info
> Failed with exception addFiles: error while moving files!!!
> FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
>
> But when I download the files from HDFS to the local machine and then
> load them into the table, it works. The data in
> '/group/taobao/taobao/dw/stb/20100125/collect_info/*' is a little more
> than 200 GB.
>
> I need to use Hive to make some statistics. Many thanks :-)

The size of the files shouldn't really matter (move operations affect metadata only - the blocks aren't rewritten or anything like that). Check your Hive log files (by default in /tmp/<user>/hive.log on the local machine you run Hive on, I believe) and you should see a stack trace with additional information.

Regards.
--
Eric Sammer
[email protected]
http://esammer.blogspot.com
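The stack trace Eric mentions can usually be pulled straight from that default location (assuming the default log directory has not been changed), for example:

# default Hive CLI log is /tmp/<user>/hive.log
tail -n 100 /tmp/$USER/hive.log
# or jump to the most recent exceptions
grep -n Exception /tmp/$USER/hive.log | tail -n 20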
--
Yours,
Zheng
