[ https://issues.apache.org/jira/browse/HIVE-4627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nick Dimiduk updated HIVE-4627:
-------------------------------
    Description: 
I'd like to use Hive to generate HFiles for HBase. I started off by following the instructions on the [wiki|https://cwiki.apache.org/confluence/display/Hive/HBaseBulkLoad], but that took me only so far. TotalOrderPartitioning didn't work. That took me to this [post|http://stackoverflow.com/questions/13715044/hive-cluster-by-vs-order-by-vs-sort-by] which points out that Hive partitions on value instead of key. A patched TOP brings me to this error:

{noformat}
2013-05-17 21:00:47,781 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: No files found in hdfs://ip-10-191-3-134.ec2.internal:8020/tmp/hive-hrt_qa/hive_2013-05-17_20-58-58_357_6896546413926013201/_task_tmp.-ext-10000/_tmp.000000_0
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:317)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:532)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: No files found in hdfs://ip-10-191-3-134.ec2.internal:8020/tmp/hive-hrt_qa/hive_2013-05-17_20-58-58_357_6896546413926013201/_task_tmp.-ext-10000/_tmp.000000_0
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:183)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:865)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:588)
    at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:597)
    at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:309)
    ... 7 more
Caused by: java.io.IOException: No files found in hdfs://ip-10-191-3-134.ec2.internal:8020/tmp/hive-hrt_qa/hive_2013-05-17_20-58-58_357_6896546413926013201/_task_tmp.-ext-10000/_tmp.000000_0
    at org.apache.hadoop.hive.hbase.HiveHFileOutputFormat$1.close(HiveHFileOutputFormat.java:142)
    at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:180)
    ... 11 more
{noformat}

  was:
I'd like to use Hive to generate HFiles for HBase. I started off by following the instructions on the [wiki|https://cwiki.apache.org/Hive/hbasebulkload.html], but that took me only so far. TotalOrderPartitioning didn't work. That took me to this [post|http://stackoverflow.com/questions/13715044/hive-cluster-by-vs-order-by-vs-sort-by] which points out that Hive partitions on value instead of key. A patched TOP brings me to this error (stack trace identical to the one above).

bump. updating wiki link.

> Total ordering of Hive output
> -----------------------------
>
>                 Key: HIVE-4627
>                 URL: https://issues.apache.org/jira/browse/HIVE-4627
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Nick Dimiduk
>         Attachments: 00_tables.ddl, 01_sample.hql, 02_hfiles.hql, hive-partitioner.patch