[ https://issues.apache.org/jira/browse/HIVE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739197#action_12739197 ]
Zheng Shao commented on HIVE-718:
---------------------------------

bq. Zheng, aren't buckets separate subdirs? They work, so sub-dirs should be fine.

Buckets are files, not directories. I tried adding a directory to a table and then ran the query below. Apparently the Hadoop file input format does not like the sub-directory:

{code}
> select * from zshao_tt;
OK
Failed with exception java.io.IOException:Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
09/08/04 14:49:38 ERROR exec.FetchTask: Failed with exception java.io.IOException:Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
java.io.IOException: Not a file: hdfs://dfs1.data.facebook.com:9000/user/facebook/warehouse/zshao_tt/a
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:231)
        at org.apache.hadoop.hive.ql.exec.FetchTask.getRecordReader(FetchTask.java:236)
        at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:291)
        at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:368)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:306)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:166)
        at org.apache.hadoop.mapred.JobShell.run(JobShell.java:194)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.mapred.JobShell.main(JobShell.java:220)
{code}

I discussed this with Ashish offline.
I think we still want inserts to be atomic. As a result, we may need to manually expand the input directory into a list of files and feed those files (instead of the directories) into the map/reduce jobs. The relevant code is in ExecDriver.java and MapRedTask.java, where we set the JobConf.

What do you think?

> Load data inpath into a new partition without overwrite does not move the file
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-718
>                 URL: https://issues.apache.org/jira/browse/HIVE-718
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Zheng Shao
>         Attachments: HIVE-718.1.patch, HIVE-718.2.patch, hive-718.txt
>
>
> The bug can be reproduced as follows. Note that it only happens for
> partitioned tables. The select after the first load returns nothing, while
> the select after the second load returns the data correctly.
> insert.txt in the current local directory contains 3 lines: "a", "b" and "c".
> {code}
> > create table tmp_insert_test (value string) stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test;
> > select * from tmp_insert_test;
> a
> b
> c
> > create table tmp_insert_test_p ( value string) partitioned by (ds string)
> > stored as textfile;
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition
> > (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> > load data local inpath 'insert.txt' into table tmp_insert_test_p partition
> > (ds = '2009-08-01');
> > select * from tmp_insert_test_p where ds= '2009-08-01';
> a       2009-08-01
> b       2009-08-01
> c       2009-08-01
> {code}

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
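
For illustration only, the directory-expansion idea from the comment above (turn a table directory that may contain sub-directories into a flat list of files before handing it to the job) could look roughly like the sketch below. It is not the Hive patch: the class and method names are hypothetical, and it walks the local filesystem with java.nio.file as a stand-in for HDFS. In Hive the walk would presumably go through Hadoop's FileSystem API, with the resulting paths set on the JobConf.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not Hive code: flattens a directory tree into plain files.
final class InputPathExpander {

    private InputPathExpander() {}

    /** Returns every regular file under root, descending into sub-directories. */
    static List<Path> expandToFiles(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        collect(root, files);
        Collections.sort(files); // stable order, independent of listing order
        return files;
    }

    private static void collect(Path path, List<Path> out) throws IOException {
        if (Files.isDirectory(path)) {
            // Recurse instead of passing the directory itself as an input path.
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    collect(child, out);
                }
            }
        } else if (Files.isRegularFile(path)) {
            out.add(path);
        }
    }
}
```

With a layout like the one in the failing query (a table directory containing a sub-directory "a"), the returned list would contain only the files, never the sub-directory itself, which is what the input format appears to require.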