[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user
[ https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015794#comment-13015794 ]

jirapos...@reviews.apache.org commented on HIVE-1988:

This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/528/#review386

http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/528/#comment734
HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we remove it if not required anymore?

http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
https://reviews.apache.org/r/528/#comment735
Do you want to move this into setup(), as it is common in both testcases?

http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
https://reviews.apache.org/r/528/#comment736
code looks duplicated. Can it be refactored by passing group names to a method?

- Amareshwari

On 2011-03-29 10:26:38, Devaraj Das wrote:
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/528/
bq.
bq. (Updated 2011-03-29 10:26:38)
bq.
bq. Review request for hive.
bq.
bq. Summary
bq. ---
bq. Fixes to some security issues discussed in HIVE-1988
bq.
bq. This addresses bug HIVE-1988.
bq. https://issues.apache.org/jira/browse/HIVE-1988
bq.
bq. Diffs
bq. ---
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1085623
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1085623
bq.
bq. Diff: https://reviews.apache.org/r/528/diff
bq.
bq. Testing
bq. ---
bq. New unit test added and that passes. All unit tests passed.
bq.
bq. Thanks,
bq. Devaraj

Make the delegation token issued by the MetaStore owned by the right user
Key: HIVE-1988
URL: https://issues.apache.org/jira/browse/HIVE-1988
Project: Hive
Issue Type: Bug
Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.8.0
Attachments: hive-1988-3.patch, hive-1988.patch

The 'owner' of any delegation token issued by the MetaStore is set to the requesting user. When a delegation token is asked by the user himself during a job submission, this is fine. However, in the case where the token is requested for by services (e.g., Oozie), on behalf of the user, the token's owner is set to the user the service is running as. Later on, when the token is used by a MapReduce task, the MetaStore treats the incoming request as coming from Oozie and does operations as Oozie. This means any new directory creations (e.g., create_table) on the hdfs by the MetaStore will end up with Oozie as the owner. Also, the MetaStore doesn't check whether a user asking for a token on behalf of some other user, is actually authorized to act on behalf of that other user. We should start using the ProxyUser authorization in the MetaStore (HADOOP-6510's APIs).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
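The ownership problem in the HIVE-1988 description can be sketched in plain Java. This is only an illustrative model, not the MetaStore code and not the HADOOP-6510 API: the class `ProxyTokenSketch`, the `Token` record, the `allowedProxies` map and the user names are all assumptions made up for the example. It shows the two behaviours the issue asks for: a token requested by a service on behalf of a user is owned by that user, and the request is rejected when the service is not authorized to act for that user.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the token-ownership rule; not MetaStore or Hadoop code.
class ProxyTokenSketch {

    /** Illustrative token record: 'owner' is the user the token acts as,
     *  'realUser' is the service that requested it (null if there is none). */
    static final class Token {
        final String owner;
        final String realUser;

        Token(String owner, String realUser) {
            this.owner = owner;
            this.realUser = realUser;
        }
    }

    // Assumed proxy rules: service name -> users it may act on behalf of.
    private final Map<String, Set<String>> allowedProxies;

    ProxyTokenSketch(Map<String, Set<String>> allowedProxies) {
        this.allowedProxies = allowedProxies;
    }

    /** Issues a token owned by 'onBehalfOf' after checking that
     *  'requester' is allowed to act on behalf of that user. */
    Token issue(String requester, String onBehalfOf) {
        if (onBehalfOf == null || onBehalfOf.equals(requester)) {
            // A user asking for their own token: the requester is the owner.
            return new Token(requester, null);
        }
        Set<String> allowed = allowedProxies.get(requester);
        if (allowed == null || !allowed.contains(onBehalfOf)) {
            throw new SecurityException(
                requester + " may not act on behalf of " + onBehalfOf);
        }
        // The end user owns the token, so later operations run as that user
        // rather than as the service.
        return new Token(onBehalfOf, requester);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> rules = new HashMap<>();
        rules.put("oozie", Collections.singleton("alice"));
        ProxyTokenSketch issuer = new ProxyTokenSketch(rules);

        Token t = issuer.issue("oozie", "alice");
        System.out.println("owner=" + t.owner + ", realUser=" + t.realUser);
    }
}
```

In a real deployment the proxy rules would come from Hadoop's proxy-user configuration rather than an in-memory map; the map is used here only to keep the sketch self-contained.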
[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user
[ https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015795#comment-13015795 ] Amareshwari Sriramadasu commented on HIVE-1988: --- Changes look good overall. I updated the review board with some minor comments. You can upload the next patch with generated code. Make the delegation token issued by the MetaStore owned by the right user - Key: HIVE-1988 URL: https://issues.apache.org/jira/browse/HIVE-1988 Project: Hive Issue Type: Bug Components: Metastore, Security, Server Infrastructure Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.8.0 Attachments: hive-1988-3.patch, hive-1988.patch The 'owner' of any delegation token issued by the MetaStore is set to the requesting user. When a delegation token is asked by the user himself during a job submission, this is fine. However, in the case where the token is requested for by services (e.g., Oozie), on behalf of the user, the token's owner is set to the user the service is running as. Later on, when the token is used by a MapReduce task, the MetaStore treats the incoming request as coming from Oozie and does operations as Oozie. This means any new directory creations (e.g., create_table) on the hdfs by the MetaStore will end up with Oozie as the owner. Also, the MetaStore doesn't check whether a user asking for a token on behalf of some other user, is actually authorized to act on behalf of that other user. We should start using the ProxyUser authorization in the MetaStore (HADOOP-6510's APIs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties
[ https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015818#comment-13015818 ]

Thiruvel Thirumoolan commented on HIVE-2032:

@Amareshwari After the alter, new tables will be created under the new location. Old tables have the fully qualified location in their metadata, so they should continue to work as before. The reasons I went with alter location are:
1. It allows migration if one would like to reorganize or if quota runs out. Not sure how many folks have this situation.
2. One can migrate new tables to another DFS cluster (for example, when the existing cluster is becoming full).
3. One can migrate between file systems if there are sufficient use cases.
Do you think these are valid use cases?

create database does not honour warehouse.dir in dbproperties
Key: HIVE-2032
URL: https://issues.apache.org/jira/browse/HIVE-2032
Project: Hive
Issue Type: Bug
Components: Clients
Affects Versions: 0.7.0, 0.8.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Fix For: 0.8.0
Attachments: DatabaseLocation.patch

# create database db with dbproperties ('hive.metastore.warehouse.dir' = 'loc');
The above command does not set the location of 'db' to 'loc'. It instead creates 'db.db' under the warehouse directory configured in the hive-site.xml of the CLI. This looks like it conflicts with HIVE-1820's expectation. If scratch dir is specified here, that is honoured.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties
[ https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015925#comment-13015925 ]

Thiruvel Thirumoolan commented on HIVE-2032:

Use cases make sense. But drop database would remove tables only from the new location. If I am not wrong, drop db succeeds only if all tables under it are dropped. Thiruvel, did you get a chance to test this? I ask because your changes in the patch do not look complete. Changes should be propagated to PersistentManager through ObjectStore. Will take a look.

create database does not honour warehouse.dir in dbproperties
Key: HIVE-2032
URL: https://issues.apache.org/jira/browse/HIVE-2032
Project: Hive
Issue Type: Bug
Components: Clients
Affects Versions: 0.7.0, 0.8.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
Fix For: 0.8.0
Attachments: DatabaseLocation.patch

# create database db with dbproperties ('hive.metastore.warehouse.dir' = 'loc');
The above command does not set the location of 'db' to 'loc'. It instead creates 'db.db' under the warehouse directory configured in the hive-site.xml of the CLI. This looks like it conflicts with HIVE-1820's expectation. If scratch dir is specified here, that is honoured.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1538) FilterOperator is applied twice with ppd on.
[ https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015943#comment-13015943 ] jirapos...@reviews.apache.org commented on HIVE-1538: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/550/ --- Review request for hive, Yongqiang He and namit jain. Summary --- Patch updated to trunk with newly added configuration var hive.ppd.remove.duplicatefilters This addresses bug HIVE-1538. https://issues.apache.org/jira/browse/HIVE-1538 Diffs - trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 1088944 trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 
trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 
trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 1088944
[jira] [Updated] (HIVE-1538) FilterOperator is applied twice with ppd on.
[ https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated HIVE-1538: -- Attachment: patch-1538-3.txt Patch updated to trunk FilterOperator is applied twice with ppd on. Key: HIVE-1538 URL: https://issues.apache.org/jira/browse/HIVE-1538 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Attachments: patch-1538-1.txt, patch-1538-2.txt, patch-1538-3.txt, patch-1538.txt With hive.optimize.ppd set to true, FilterOperator is applied twice. And it seems second operator is always filtering zero rows. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
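The observation in HIVE-1538 that the second FilterOperator "is always filtering zero rows" can be illustrated with a small stand-alone Java sketch. It does not use Hive's operator classes; `DuplicateFilterSketch`, `filter` and `droppedBySecondPass` are hypothetical names, and the sketch assumes a deterministic predicate. Under that assumption, every row that survives the first pass also satisfies the predicate on the second pass, so the duplicate filter only costs evaluation time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration; not Hive's FilterOperator.
class DuplicateFilterSketch {

    /** Returns the rows that satisfy the predicate, preserving their order. */
    static <T> List<T> filter(List<T> rows, Predicate<T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T row : rows) {
            if (predicate.test(row)) {
                kept.add(row);
            }
        }
        return kept;
    }

    /** Counts how many rows a second application of the same predicate drops. */
    static <T> int droppedBySecondPass(List<T> rows, Predicate<T> predicate) {
        List<T> first = filter(rows, predicate);
        List<T> second = filter(first, predicate);
        return first.size() - second.size();
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(5, 12, 7, 30, 18);
        Predicate<Integer> greaterThanTen = v -> v > 10;
        System.out.println(droppedBySecondPass(rows, greaterThanTen)); // prints 0
    }
}
```

A non-deterministic predicate (for example one based on a random value) would not give this guarantee, which is one reason removing duplicate filters may need to be handled with care.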
Query Regarding HIVE-1844.
Hello He Yongqiang / All,
I was going through the defect HIVE-1844, but I was not able to reproduce the scenario using Hive 0.5. I consistently saw OOM errors during the Copy Task on the server side, but the client did not hang. In your view, what could have made the client hang?
In my case, the Hive client received a proper response from Thrift whenever an OOM occurred on the server side, for example:
java.sql.SQLException: org.apache.thrift.TApplicationException : Internal error processing execute
Kindly give me pointers on reproducing it. Do I need to do more regression testing on it?
Just a thought/observation on the code change: why is the OOM caught so early (and as a Throwable, which will swallow other exceptions as well)? It would eventually have been caught by ThriftHive$Processor$execute.process() and appropriate action would have been taken, so I am wondering how the code change helped prevent the client hang.
Thanks and Regards,
-Mohit
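The concern about catching `Throwable` can be shown with a short Java sketch. It is not the HIVE-1844 change itself, whose details are not given in this message; `CatchScopeSketch`, `Task` and both `run...` methods are hypothetical names, and the OOM is simulated by throwing `OutOfMemoryError` directly. The sketch contrasts a handler that catches everything with one that catches only the out-of-memory case and lets unrelated failures propagate to the caller.

```java
// Hypothetical illustration of catch scope; not code from the HIVE-1844 patch.
class CatchScopeSketch {

    /** Stand-in for a unit of server-side work. */
    interface Task {
        void run() throws Exception;
    }

    /** Catches everything: an OutOfMemoryError and an unrelated bug
     *  both collapse into the same return code. */
    static int runCatchingThrowable(Task task) {
        try {
            task.run();
            return 0;
        } catch (Throwable t) {
            return 1;
        }
    }

    /** Catches only OutOfMemoryError; any other failure propagates. */
    static int runCatchingOom(Task task) throws Exception {
        try {
            task.run();
            return 0;
        } catch (OutOfMemoryError e) {
            return 1;
        }
    }

    public static void main(String[] args) throws Exception {
        Task oom = () -> { throw new OutOfMemoryError("simulated"); };
        Task bug = () -> { throw new IllegalStateException("unrelated failure"); };

        System.out.println(runCatchingThrowable(oom)); // 1
        System.out.println(runCatchingThrowable(bug)); // 1, the bug is hidden
        System.out.println(runCatchingOom(oom));       // 1
        try {
            runCatchingOom(bug);
        } catch (IllegalStateException e) {
            System.out.println("propagated: " + e.getMessage());
        }
    }
}
```

Whether the narrower catch is appropriate depends on what the outer Thrift processing layer does with the propagated exception, which is the question raised in the message above.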
Review Request: HIVE-1538
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/550/ --- Review request for hive, Yongqiang He and namit jain. Summary --- Patch updated to trunk with newly added configuration var hive.ppd.remove.duplicatefilters This addresses bug HIVE-1538. https://issues.apache.org/jira/browse/HIVE-1538 Diffs - trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 1088944 trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 
trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 
trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 1088944 trunk/ql/src/test/results/clientpositive/filter_join_breaktask.q.out 1088944 trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out 1088944 trunk/ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 1088944
[jira] [Created] (HIVE-2091) Test scripts need to be made deterministic in their output
Test scripts need to be made deterministic in their output -- Key: HIVE-2091 URL: https://issues.apache.org/jira/browse/HIVE-2091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.7.0 Reporter: Roman Shaposhnik Priority: Minor Currently this 2 query scripts generate non-deterministic output: The suggestion is to use GROUP BY statement. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Shaposhnik updated HIVE-2091: --- Description: Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use GROUP BY statement. was: Currently this 2 query scripts generate non-deterministic output: The suggestion is to use GROUP BY statement. Summary: Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output (was: Test scripts need to be made deterministic in their output) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output - Key: HIVE-2091 URL: https://issues.apache.org/jira/browse/HIVE-2091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.7.0 Reporter: Roman Shaposhnik Priority: Minor Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use GROUP BY statement. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-2068: - Status: Open (was: Patch Available) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation --- Key: HIVE-2068 URL: https://issues.apache.org/jira/browse/HIVE-2068 Project: Hive Issue Type: Improvement Reporter: Siying Dong Assignee: Siying Dong Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT xxx will start a MapReduce job with input to be the whole table or partition. The latency can be huge if the table or partition is big. We could reduce number of input files to speed up the queries. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Shaposhnik updated HIVE-2091: --- Status: Patch Available (was: Open) Please take a look at the attached patch Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output - Key: HIVE-2091 URL: https://issues.apache.org/jira/browse/HIVE-2091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.7.0 Reporter: Roman Shaposhnik Priority: Minor Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use GROUP BY statement. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roman Shaposhnik updated HIVE-2091: --- Attachment: HIVE-2091.patch Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output - Key: HIVE-2091 URL: https://issues.apache.org/jira/browse/HIVE-2091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.7.0 Reporter: Roman Shaposhnik Priority: Minor Attachments: HIVE-2091.patch Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use GROUP BY statement. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output
[ https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2091: - Description: Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use ORDER BY statement. was: Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use GROUP BY statement. Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output - Key: HIVE-2091 URL: https://issues.apache.org/jira/browse/HIVE-2091 Project: Hive Issue Type: Bug Components: Testing Infrastructure Affects Versions: 0.7.0 Reporter: Roman Shaposhnik Priority: Minor Attachments: HIVE-2091.patch Currently this 2 query scripts generate non-deterministic output: * ql/src/test/queries/clientpositive/rcfile_columnar.q * ql/src/test/queries/clientpositive/join_filters.q The suggestion is to use ORDER BY statement. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
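The description update above changes the suggestion from GROUP BY to ORDER BY. A minimal Java sketch of why an explicit ordering makes test output comparable: the same result set emitted in two different row orders only compares equal once both are sorted. The class `DeterministicOutputSketch`, the `ordered` helper and the sample rows are illustrative assumptions, not taken from rcfile_columnar.q or join_filters.q.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; sorting here plays the role of ORDER BY.
class DeterministicOutputSketch {

    /** Returns a sorted copy of the rows, leaving the input untouched. */
    static List<String> ordered(List<String> rows) {
        List<String> copy = new ArrayList<>(rows);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // The same result set as two runs might emit it, in different row orders.
        List<String> runA = Arrays.asList("238\tval_238", "86\tval_86", "311\tval_311");
        List<String> runB = Arrays.asList("86\tval_86", "311\tval_311", "238\tval_238");

        System.out.println(runA.equals(runB));                   // false
        System.out.println(ordered(runA).equals(ordered(runB))); // true
    }
}
```

Grouping alone collapses rows but does not, in general, promise the order in which the groups are emitted, which is presumably why the description was changed to ORDER BY.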
Build failed in Jenkins: Hive-0.7.0-h0.20 #66
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/66/ -- [...truncated 27402 lines...] [junit] Hive history file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051246_596912665.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: LOAD [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: LOAD [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1 [junit] Total MapReduce jobs = 1 [junit] Launching Job 1 out of 1 [junit] Number of reduce tasks determined at compile time: 1 [junit] In order to change the average load for a reducer (in bytes): [junit] set hive.exec.reducers.bytes.per.reducer=number [junit] In order to limit the maximum number of reducers: [junit] set hive.exec.reducers.max=number [junit] In order to set a constant number of reducers: [junit] set 
mapred.reduce.tasks=number [junit] Job running in-process (local Hadoop) [junit] 2011-04-05 12:46:53,444 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051246_1722588719.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: LOAD [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: LOAD 
[junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1 [junit] OK [junit] PREHOOK: query: drop table
[jira] [Resolved] (HIVE-2072) test
[ https://issues.apache.org/jira/browse/HIVE-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2072. -- Resolution: Incomplete test Key: HIVE-2072 URL: https://issues.apache.org/jira/browse/HIVE-2072 Project: Hive Issue Type: Test Reporter: YoungYik Priority: Trivial -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2066) Metastore Schema Scripts
[ https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016098#comment-13016098 ] Carl Steinbach commented on HIVE-2066: -- This ticket is being used as a convenient public storage space for versioned dumps of the Hive MetaStore database schema. Metastore Schema Scripts Key: HIVE-2066 URL: https://issues.apache.org/jira/browse/HIVE-2066 Project: Hive Issue Type: Task Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: hive-schema-0.3.0.derby.sql, hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, hive-schema-0.7.0.postgres.sql -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2066) Metastore Schema Scripts
[ https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2066. -- Resolution: Not A Problem Metastore Schema Scripts Key: HIVE-2066 URL: https://issues.apache.org/jira/browse/HIVE-2066 Project: Hive Issue Type: Task Components: Metastore Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: hive-schema-0.3.0.derby.sql, hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, hive-schema-0.7.0.postgres.sql -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-1668) Move HWI out to Github
[ https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1668. -- Resolution: Not A Problem Move HWI out to Github -- Key: HIVE-1668 URL: https://issues.apache.org/jira/browse/HIVE-1668 Project: Hive Issue Type: Improvement Components: Web UI Reporter: Jeff Hammerbacher I have seen HWI cause a number of build and test errors, and it's now going to cost us some extra work for integration with security. We've worked on hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick with HWI. I think it's time to move it out to Github. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2093) inputs are outputs should be populated for create/drop database
inputs are outputs should be populated for create/drop database --- Key: HIVE-2093 URL: https://issues.apache.org/jira/browse/HIVE-2093 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Siying Dong This is needed for many other things: concurrency, authorization etc. to work -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2092) support 'drop database DBNAME force';
support 'drop database DBNAME force'; --- Key: HIVE-2092 URL: https://issues.apache.org/jira/browse/HIVE-2092 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Siying Dong Currently, the above command fails if the database is not empty. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2092) support 'drop database DBNAME force';
[ https://issues.apache.org/jira/browse/HIVE-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2092. -- Resolution: Duplicate Duplicate of HIVE-2090 support 'drop database DBNAME force'; --- Key: HIVE-2092 URL: https://issues.apache.org/jira/browse/HIVE-2092 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Siying Dong Currently, the above command fails if the database is not empty. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2090: -- Attachment: HIVE-2090.2.patch This is an in-progress patch. It changes the syntax to CASCADE/RESTRICT instead of FORCE. We had some discussion offline and decided to do the logic at the object store level, so I need to make some more changes. We'll open other issues for fixing concurrency and authorization around dropping and creating databases. Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch A DROP DATABASE ... FORCE will be useful when we use a database for isolation in tests. Being able to force cleanup of the database will make test cleanup easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
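The behaviour discussed in HIVE-2090 can be pictured with a small in-memory model. This is only a sketch: it assumes the usual SQL meaning of the two keywords (RESTRICT refuses to drop a non-empty database, CASCADE drops its tables along with it), and the class and method names are hypothetical, not taken from the patch or from the Hive object store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a catalog; not Hive's ObjectStore. */
final class DropDatabaseSketch {
    private final Map<String, List<String>> tablesByDb = new HashMap<>();

    void createDatabase(String db) {
        tablesByDb.putIfAbsent(db, new ArrayList<>());
    }

    void createTable(String db, String table) {
        List<String> tables = tablesByDb.get(db);
        if (tables == null) {
            throw new IllegalArgumentException("no such database: " + db);
        }
        tables.add(table);
    }

    /** cascade=false models RESTRICT, cascade=true models CASCADE. */
    void dropDatabase(String db, boolean cascade) {
        List<String> tables = tablesByDb.get(db);
        if (tables == null) {
            throw new IllegalArgumentException("no such database: " + db);
        }
        if (!tables.isEmpty() && !cascade) {
            // RESTRICT: refuse to drop a database that still has tables.
            throw new IllegalStateException("database " + db + " is not empty");
        }
        // CASCADE (or an empty database): the tables go away with the database.
        tablesByDb.remove(db);
    }

    boolean exists(String db) {
        return tablesByDb.containsKey(db);
    }

    public static void main(String[] args) {
        DropDatabaseSketch catalog = new DropDatabaseSketch();
        catalog.createDatabase("testdb");
        catalog.createTable("testdb", "t1");
        try {
            catalog.dropDatabase("testdb", false);
        } catch (IllegalStateException e) {
            System.out.println("RESTRICT refused: " + e.getMessage());
        }
        catalog.dropDatabase("testdb", true);
        System.out.println("exists after CASCADE: " + catalog.exists("testdb"));
    }
}
```

For test cleanup, the CASCADE path is the one that removes the need to drop each table by hand before dropping the database.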
[jira] [Created] (HIVE-2094) CREATE and DROP DATABASE doesn't check user permission for doing it
CREATE and DROP DATABASE doesn't check user permission for doing it --- Key: HIVE-2094 URL: https://issues.apache.org/jira/browse/HIVE-2094 Project: Hive Issue Type: Bug Reporter: Siying Dong Assignee: He Yongqiang We need to make sure only users with system permission can do it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marquis Wang updated HIVE-1803: --- Status: Patch Available (was: Open) Implement bitmap indexing in Hive - Key: HIVE-1803 URL: https://issues.apache.org/jira/browse/HIVE-1803 Project: Hive Issue Type: New Feature Components: Indexing Reporter: Marquis Wang Assignee: Marquis Wang Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch Implement bitmap index handler to complement compact indexing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marquis Wang updated HIVE-1803: --- Attachment: unit-tests.patch HIVE-1803.11.patch New patch that fixes the minor javadocs comments from patch 10. A unit-tests patch that updates all the unit tests that were affected by the virtual column change. Implement bitmap indexing in Hive - Key: HIVE-1803 URL: https://issues.apache.org/jira/browse/HIVE-1803 Project: Hive Issue Type: New Feature Components: Indexing Reporter: Marquis Wang Assignee: Marquis Wang Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch Implement bitmap index handler to complement compact indexing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HIVE-1988
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/528/ --- (Updated 2011-04-05 21:24:34.129643) Review request for hive. Changes --- Addressed Amareshwari's comments. Summary --- Fixes to some security issues discussed in HIVE-1988 This addresses bug HIVE-1988. https://issues.apache.org/jira/browse/HIVE-1988 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155 
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155 Diff: https://reviews.apache.org/r/528/diff Testing --- New unit test added and that passes. All unit tests passed. Thanks, Devaraj
[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user
[ https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016128#comment-13016128 ] jirapos...@reviews.apache.org commented on HIVE-1988: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/528/ --- (Updated 2011-04-05 21:24:34.129643) Review request for hive. Changes --- Addressed Amareshwari's comments. Summary --- Fixes to some security issues discussed in HIVE-1988 This addresses bug HIVE-1988. https://issues.apache.org/jira/browse/HIVE-1988 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155 
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155 http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155 Diff: https://reviews.apache.org/r/528/diff Testing --- New unit test added and that passes. All unit tests passed. Thanks, Devaraj Make the delegation token issued by the MetaStore owned by the right user - Key: HIVE-1988 URL: https://issues.apache.org/jira/browse/HIVE-1988 Project: Hive Issue Type: Bug Components: Metastore, Security, Server Infrastructure Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.8.0 Attachments: hive-1988-3.patch, hive-1988.patch The 'owner' of any delegation token issued by the MetaStore is set to the requesting user. When a delegation token is asked by the user himself during a job submission, this is fine. However, in the case where the token is requested for by services (e.g., Oozie), on behalf of the user, the token's owner is set to the user the service is running as. Later on, when the token is used by a MapReduce task, the MetaStore treats the incoming request as coming from Oozie and does operations as Oozie. This means any new directory creations (e.g., create_table) on the hdfs by the MetaStore will end up with Oozie as the owner. Also, the MetaStore doesn't check whether a user asking for a token on behalf of some other user, is actually authorized to act on behalf of that other user. 
We should start using the ProxyUser authorization in the MetaStore (HADOOP-6510's APIs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
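The ownership rule described in HIVE-1988 can be sketched as follows. This is a simplified model under stated assumptions, not the patch itself: the `TokenId` class, the `issue` method and the `proxyRules` map are hypothetical names, and the map stands in for whatever proxy-user configuration the MetaStore would actually consult. The idea is that a token requested by a service on behalf of a user is owned by that user, the service is recorded as the real user, and the request is rejected if the service is not allowed to impersonate that user.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of delegation token ownership; not the Hadoop or Hive classes. */
final class TokenOwnerSketch {

    /** Minimal stand-in for a token identifier. */
    static final class TokenId {
        final String owner;    // user the token acts as
        final String realUser; // service that requested it, or null for a direct request

        TokenId(String owner, String realUser) {
            this.owner = owner;
            this.realUser = realUser;
        }
    }

    /**
     * Issues a token for onBehalfOf when requester is allowed to impersonate that user.
     * proxyRules maps a service user to the set of users it may impersonate (an assumption
     * made for this sketch).
     */
    static TokenId issue(String requester, String onBehalfOf, Map<String, Set<String>> proxyRules) {
        if (onBehalfOf == null || onBehalfOf.equals(requester)) {
            // The user asks for their own token: owner is the requester.
            return new TokenId(requester, null);
        }
        Set<String> allowed = proxyRules.get(requester);
        if (allowed == null || !allowed.contains(onBehalfOf)) {
            throw new SecurityException(requester + " is not allowed to impersonate " + onBehalfOf);
        }
        // The service asks on behalf of a user: the user owns the token.
        return new TokenId(onBehalfOf, requester);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> rules =
            Collections.singletonMap("oozie", Collections.singleton("alice"));
        TokenId token = issue("oozie", "alice", rules);
        System.out.println("owner=" + token.owner + ", realUser=" + token.realUser);
    }
}
```

With the owner set this way, a later request that presents the token would be treated as coming from the end user rather than from the service account, which is the behaviour the issue asks for.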
[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user
[ https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016129#comment-13016129 ] jirapos...@reviews.apache.org commented on HIVE-1988: - bq. On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote: bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java, line 152 bq. https://reviews.apache.org/r/528/diff/2/?file=14844#file14844line152 bq. bq. HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we remove it if not required anymore? I suggest we leave it there. This seems like a useful method, and I am actually using it in another patch. bq. On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote: bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java, lines 144-156 bq. https://reviews.apache.org/r/528/diff/2/?file=14850#file14850line144 bq. bq. Do you want to move this into setup(), as it is common in both testcases? Done bq. On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote: bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java, lines 192-209 bq. https://reviews.apache.org/r/528/diff/2/?file=14850#file14850line192 bq. bq. code looks duplicated. Can it be refactored by passing group names to a method? Done - Devaraj --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/528/#review386 --- On 2011-03-29 10:26:38, Devaraj Das wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/528/ bq. --- bq. bq. (Updated 2011-03-29 10:26:38) bq. bq. bq. Review request for hive. bq. bq. bq. Summary bq. --- bq. bq. Fixes to some security issues discussed in HIVE-1988 bq. bq. bq. This addresses bug HIVE-1988. bq. https://issues.apache.org/jira/browse/HIVE-1988 bq. bq. bq. Diffs bq. - bq. bq. 
http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1085623 bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1085623 bq. bq. Diff: https://reviews.apache.org/r/528/diff bq. bq. bq. Testing bq. --- bq. bq. New unit test added and that passes. All unit tests passed. bq. bq. bq. Thanks, bq. bq. Devaraj bq. bq. Make the delegation token issued by the MetaStore owned by the right user - Key: HIVE-1988 URL: https://issues.apache.org/jira/browse/HIVE-1988 Project: Hive Issue Type: Bug Components: Metastore, Security, Server Infrastructure Affects Versions: 0.7.0 Reporter: Devaraj Das Assignee: Devaraj Das Fix For: 0.8.0 Attachments: hive-1988-3.patch, hive-1988.patch The 'owner' of any delegation token issued by the MetaStore is set to the requesting user. When a delegation token is asked by the user himself during a job submission, this is fine. However, in the case where the token is requested for by services (e.g., Oozie), on behalf of the user, the token's owner is set to the user the service is running as. 
Later on, when the token is used by a MapReduce task, the MetaStore treats the incoming request as coming from Oozie and does operations as Oozie. This means any new directory creations (e.g., create_table) on the hdfs by the MetaStore will end up with Oozie as the owner. Also, the MetaStore doesn't check whether a user asking for a token on behalf of some other user, is actually authorized to
[jira] [Updated] (HIVE-867) Add add UDFs found in mysql
[ https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-867: Component/s: UDF Add add UDFs found in mysql --- Key: HIVE-867 URL: https://issues.apache.org/jira/browse/HIVE-867 Project: Hive Issue Type: New Feature Components: UDF Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: hive-867-1.diff, hive-867-10.diff, hive-867-2.diff, hive-867-3.diff, hive-867-7.diff Some UDFs that MySQL has that Hive does not: atan aes_decrypt aes_encrypt bit_and bit_count bit_length bit_or bit_xor char_length char character_length collation compress crc32 encode encrypt format greatest in inet_aton inet_ntoa match md5 oct ord pi radians sha1 _sha sign sleep truncate -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
[ https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2061. -- Resolution: Fixed Fix Version/s: 0.8.0 Hadoop Flags: [Reviewed] Committed to trunk. Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility -- Key: HIVE-2061 URL: https://issues.apache.org/jira/browse/HIVE-2061 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Ning Zhang Assignee: Ning Zhang Priority: Minor Fix For: 0.8.0 Attachments: HIVE-2061.patch We have seen a use case where the user's script runs 'add jar hive_contrib.jar'. Since Hive has moved the jar file to hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask the user to change the script, then when Hive upgrades its version again, the user needs to change the script again. Creating a symlink seems to be the best solution. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility
[ https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2061: - Component/s: Build Infrastructure Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility -- Key: HIVE-2061 URL: https://issues.apache.org/jira/browse/HIVE-2061 Project: Hive Issue Type: Bug Components: Build Infrastructure Reporter: Ning Zhang Assignee: Ning Zhang Priority: Minor Fix For: 0.8.0 Attachments: HIVE-2061.patch We have seen a use case where the user's script runs 'add jar hive_contrib.jar'. Since Hive has moved the jar file to hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask the user to change the script, then when Hive upgrades its version again, the user needs to change the script again. Creating a symlink seems to be the best solution. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1985) better error message for selecting non-existing columns
[ https://issues.apache.org/jira/browse/HIVE-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1985: - Component/s: Query Processor Diagnosability better error message for selecting non-existing columns --- Key: HIVE-1985 URL: https://issues.apache.org/jira/browse/HIVE-1985 Project: Hive Issue Type: Improvement Components: Diagnosability, Query Processor Reporter: He Yongqiang Should have an error message for a query like : select a.key,a,a.value from src a; -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2057) eliminate parser warning for Identifier DOT Identifier
[ https://issues.apache.org/jira/browse/HIVE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2057: - Component/s: Diagnosability eliminate parser warning for Identifier DOT Identifier Key: HIVE-2057 URL: https://issues.apache.org/jira/browse/HIVE-2057 Project: Hive Issue Type: Improvement Components: Diagnosability, Query Processor Reporter: John Sichi I noticed this warning in recent builds: {noformat} build-grammar: [echo] Building Grammar /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g [java] ANTLR Parser Generator Version 3.0.1 (August 13, 2007) 1989-2007 [java] warning(200): /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g:1503:5: Decision can match input such as Identifier DOT Identifier using multiple alternatives: 1, 2 [java] As a result, alternative(s) 2 were disabled for that input {noformat} This was introduced by HIVE-1517. Is there a way to get rid of it? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default
[ https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1935: - Component/s: Security set hive.security.authorization.createtable.owner.grants to null by default --- Key: HIVE-1935 URL: https://issues.apache.org/jira/browse/HIVE-1935 Project: Hive Issue Type: Bug Components: Security Reporter: He Yongqiang Assignee: He Yongqiang Fix For: 0.7.0 Attachments: HIVE-1935.1.patch It seems an empty setting in hive-site.xml does not overwrite hive-default.xml -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default
[ https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-1935. -- Resolution: Fixed Fix Version/s: 0.7.0 Hadoop Flags: [Reviewed] set hive.security.authorization.createtable.owner.grants to null by default --- Key: HIVE-1935 URL: https://issues.apache.org/jira/browse/HIVE-1935 Project: Hive Issue Type: Bug Components: Security Reporter: He Yongqiang Assignee: He Yongqiang Fix For: 0.7.0 Attachments: HIVE-1935.1.patch It seems an empty setting in hive-site.xml does not overwrite hive-default.xml -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml
[ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1841: - Component/s: Metastore datanucleus.fixedDatastore should be true in hive-default.xml -- Key: HIVE-1841 URL: https://issues.apache.org/jira/browse/HIVE-1841 Project: Hive Issue Type: Improvement Components: Configuration, Metastore Affects Versions: 0.6.0 Reporter: Edward Capriolo Priority: Minor Attachments: HIVE-1841.1.patch.txt Two datanucleus variables:
{noformat}
<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>false</value>
</property>
<property>
  <name>datanucleus.fixedDatastore</name>
  <value>true</value>
</property>
{noformat}
are dangerous. We do want the schema to auto-create itself, but we do not want the schema to auto-update itself. Someone might accidentally point a trunk build at the wrong metastore and unknowingly update it. I believe we should set this to false and possibly trap exceptions stemming from Hive wanting to do any update. This way someone has to actively acknowledge the update, either by setting this to true and then starting up Hive, or by leaving it false, removing schema-modify privileges from the user that Hive uses, and doing the updates by hand. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1825) Different defaults for hive.metastore.local
[ https://issues.apache.org/jira/browse/HIVE-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1825: - Component/s: Metastore Different defaults for hive.metastore.local --- Key: HIVE-1825 URL: https://issues.apache.org/jira/browse/HIVE-1825 Project: Hive Issue Type: Bug Components: Configuration, Metastore Affects Versions: 0.6.0 Reporter: Lars Francke hive-default.xml sets {{hive.metastore.local}} to {{true}}. In the code however there is this:
{code:title=HiveMetaStoreClient.java}
boolean localMetaStore = conf.getBoolean("hive.metastore.local", false);
{code}
This leads to different behaviour depending on whether hbase-default.xml is on the classpath or not, which can lead to some confusion ;-) I can supply a patch - should be pretty similar. I just don't know what the real default should be. My guess would be {{true}}. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
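The mismatch reported in HIVE-1825 can be reproduced in miniature with plain java.util.Properties. This is a sketch only: the helper below imitates the shape of a getBoolean(name, fallback) lookup and is not the Hadoop Configuration class, and the class name is hypothetical. The point is that the fallback in the code is consulted only when no loaded file defines the property, so the effective value depends on whether the defaults file was found.

```java
import java.util.Properties;

/** Toy model of a configuration lookup with a code-level fallback default. */
final class MetastoreLocalDefault {

    // The fallback is used only when no loaded file defines the property.
    static boolean getBoolean(Properties conf, String name, boolean fallback) {
        String raw = conf.getProperty(name);
        return raw == null ? fallback : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        // As if the defaults file was on the classpath and set the property to true.
        Properties withDefaultsFile = new Properties();
        withDefaultsFile.setProperty("hive.metastore.local", "true");

        // As if the defaults file was missing: nothing defines the property.
        Properties withoutDefaultsFile = new Properties();

        System.out.println(getBoolean(withDefaultsFile, "hive.metastore.local", false));
        System.out.println(getBoolean(withoutDefaultsFile, "hive.metastore.local", false));
    }
}
```

Keeping the fallback in the code equal to the value in the defaults file would make the two cases agree, whichever value is chosen as the real default.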
[jira] [Updated] (HIVE-1875) On job failure log some messages explaining that Hive is retrieving task completion events
[ https://issues.apache.org/jira/browse/HIVE-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1875: - Component/s: Diagnosability On job failure log some messages explaining that Hive is retrieving task completion events -- Key: HIVE-1875 URL: https://issues.apache.org/jira/browse/HIVE-1875 Project: Hive Issue Type: Improvement Components: Diagnosability, Query Processor Reporter: Carl Steinbach If a job fails, Hive currently displays a link to the task with the most failures for easy access to the error logs. However, generating the link may require many RPCs to get all the task completion events, adding a delay of up to 30 minutes. HIVE-1578 added a configuration property that allows the user to disable this behavior. This ticket covers adding some logging statements notifying the user that Hive is retrieving this information. This is intended to avoid giving the user the impression that the CLI has simply locked up. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1095) Hive in Maven
[ https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016146#comment-13016146 ] Ning Zhang commented on HIVE-1095: -- I tried the first command: ant make-maven -Dversion=0.8.0-SNAPSHOT -logfile make-maven.log and it seems to have succeeded. I'll attach make-maven.log. It would be nice if someone who has the knowledge could take a look and see if it is correct. I haven't run the other command to publish to Maven yet. I can run that as long as I get a +1 from committers who have the knowledge. Hive in Maven - Key: HIVE-1095 URL: https://issues.apache.org/jira/browse/HIVE-1095 Project: Hive Issue Type: Task Components: Build Infrastructure Affects Versions: 0.6.0 Reporter: Gerrit Jansen van Vuuren Priority: Minor Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, hiveReleasedToMaven.tar.gz Getting hive into maven main repositories Documentation on how to do this is on: http://maven.apache.org/guides/mini/guide-central-repository-upload.html -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1095) Hive in Maven
[ https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1095: - Attachment: make-maven.log Hive in Maven - Key: HIVE-1095 URL: https://issues.apache.org/jira/browse/HIVE-1095 Project: Hive Issue Type: Task Components: Build Infrastructure Affects Versions: 0.6.0 Reporter: Gerrit Jansen van Vuuren Priority: Minor Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, hiveReleasedToMaven.tar.gz, make-maven.log Getting hive into maven main repositories Documentation on how to do this is on: http://maven.apache.org/guides/mini/guide-central-repository-upload.html -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1301) RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add options
[ https://issues.apache.org/jira/browse/HIVE-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1301: - Component/s: UDF RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add options Key: HIVE-1301 URL: https://issues.apache.org/jira/browse/HIVE-1301 Project: Hive Issue Type: Wish Components: UDF Reporter: Adam Kramer Assignee: Paul Yang The generation of pseudorandom data is very useful, but would be even MORE useful if we had a few levers to pull. Currently, RAND() generates a random number pulled from a uniform distribution between 0 and 1. It would be great if we could user-specify the min and max because that is a more elegant way to write code: RAND()*200+50 will generate the same thing as RAND_UNIF(min=50,max=250) but the latter is a much better way to express this in a readable manner. Similarly, it would be useful to have non-uniform random data for statistical purposes. RAND_NORM(mean=0,sd=1) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
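The two functions proposed in HIVE-1301 can be sketched in plain Java on top of java.util.Random. The names `randUnif` and `randNorm` follow the proposal, but the class, the signatures and the argument checks are assumptions made for illustration and are not an existing Hive UDF. The uniform case is the same scaling as the RAND()*200+50 example in the issue, written with explicit bounds.

```java
import java.util.Random;

/** Hypothetical helpers mirroring the proposed RAND_UNIF and RAND_NORM. */
final class RandSketch {

    /** Uniform draw from [min, max), built from a draw in [0, 1). */
    static double randUnif(Random rng, double min, double max) {
        if (!(min < max)) {
            throw new IllegalArgumentException("min must be less than max");
        }
        return min + rng.nextDouble() * (max - min);
    }

    /** Normal draw with the given mean and standard deviation. */
    static double randNorm(Random rng, double mean, double sd) {
        if (sd < 0) {
            throw new IllegalArgumentException("sd must be non-negative");
        }
        // nextGaussian() has mean 0 and standard deviation 1; shift and scale it.
        return mean + sd * rng.nextGaussian();
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(randUnif(rng, 50, 250)); // same range as RAND()*200+50
        System.out.println(randNorm(rng, 0, 1));
    }
}
```

Passing the generator in explicitly keeps the helpers deterministic under a fixed seed, which is convenient when checking that the scaled form and the hand-written expression agree.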
[jira] [Updated] (HIVE-1262) Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
[ https://issues.apache.org/jira/browse/HIVE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1262: - Component/s: UDF Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt - Key: HIVE-1262 URL: https://issues.apache.org/jira/browse/HIVE-1262 Project: Hive Issue Type: New Feature Components: UDF Affects Versions: 0.6.0 Reporter: Edward Capriolo Assignee: Edward Capriolo Attachments: hive-1262-1.patch.txt Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1360) Allow UDFs to access constant parameter values at compile time
[ https://issues.apache.org/jira/browse/HIVE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1360: - Component/s: UDF Allow UDFs to access constant parameter values at compile time -- Key: HIVE-1360 URL: https://issues.apache.org/jira/browse/HIVE-1360 Project: Hive Issue Type: Improvement Components: Query Processor, UDF Affects Versions: 0.5.0 Reporter: Carl Steinbach Assignee: Carl Steinbach UDFs should be able to access constant parameter values at compile time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1384) HiveServer should run as the user who submitted the query.
[ https://issues.apache.org/jira/browse/HIVE-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1384: - Component/s: Security HiveServer should run as the user who submitted the query. -- Key: HIVE-1384 URL: https://issues.apache.org/jira/browse/HIVE-1384 Project: Hive Issue Type: Improvement Components: Metastore, Security, Server Infrastructure Reporter: He Yongqiang Assignee: He Yongqiang -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1343) add an interface in RCFile to support concatenation of two files without (de)compression
[ https://issues.apache.org/jira/browse/HIVE-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1343: - Component/s: Serializers/Deserializers add an interface in RCFile to support concatenation of two files without (de)compression Key: HIVE-1343 URL: https://issues.apache.org/jira/browse/HIVE-1343 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Affects Versions: 0.6.0 Reporter: Ning Zhang Assignee: He Yongqiang Attachments: HIVE-1343.1.patch If two files are concatenated, we need to read each record in these files and write them back to the destination file. The IO cost is mostly unavoidable due to the lack of append functionality in HDFS. However, the CPU cost could be significantly reduced by avoiding compression and decompression of the files. The File Format layer should provide an API that implements block-level concatenation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1480) CREATE TABLE IF NOT EXISTS get incorrect table name
[ https://issues.apache.org/jira/browse/HIVE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1480: - Component/s: Query Processor CREATE TABLE IF NOT EXISTS get incorrect table name --- Key: HIVE-1480 URL: https://issues.apache.org/jira/browse/HIVE-1480 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Ning Zhang Assignee: Ning Zhang CREATE TABLE IF NOT EXISTS T AS SELECT ... gives the following error after the job succeeded: Setting total progress to 100 10/07/22 11:26:14 INFO exec.ExecDriver: Ended Job = job_201006221843_688872 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001 to: hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate to: hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001 Moving data to: hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43 10/07/22 11:26:15 INFO exec.MoveTask: Moving data to: hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43 from hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001 10/07/22 11:26:15 WARN hdfs.DFSClient: File /user/facebook/warehouse/ericm_budget_email_actua43 is beng deleted only through Trash org.apache.hadoop.fs.FsShell.delete because all deletes must go through Trash. 
10/07/22 11:26:15 INFO hive.log: DDL: struct ericm_budget_email_actua43 { string acct_id, string first_name, string email, string campaign_name_list} 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: create_table: db=default tbl=ericm_budget_email_actua43 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=ericm_budget_email_actua43 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com database:audit_silver 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com database:lineage_silver 10/07/22 11:26:15 INFO hooks.HookUtils: rows inserted: 1 sql: insert into snc1_command_log set command = ?, command_type = ?, inputs = ?, outputs = ?, queryId = ?, user_info = ? OK 10/07/22 11:26:15 INFO ql.Driver: OK 10/07/22 11:26:16 INFO ql.Context: getStream error: java.io.FileNotFoundException: File does not exist: hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/1 at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457) at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:294) at org.apache.hadoop.hive.ql.Context.getStream(Context.java:386) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:688) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:146) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:294) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Time taken: 361.26 seconds 10/07/22 11:26:16 INFO CliDriver: Time taken: 361.26 seconds Exit code: 0, 0 dus: Cannot access /user/facebook/warehouse/IF: No such file or 
directory. tablesize cmd:/mnt/vol/hive/sites/silver.trunk/hadoop/bin/hadoop dfs -dus /user/facebook/warehouse/IF | cut -d$'\t' -f2 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1625) Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet.
[ https://issues.apache.org/jira/browse/HIVE-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1625: - Component/s: JDBC Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet. Key: HIVE-1625 URL: https://issues.apache.org/jira/browse/HIVE-1625 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Sean Flatley Assignee: Sean Flatley Attachments: HIVE-1625.patch, changelog.txt, testJdbcDriver.log We implemented several of the HivePreparedStatement set methods, such as setString(int, String) and the means to substitute place holders in the SQL with the values set. HiveQueryResultSet and HiveBaseResultSet were enhanced so that getStatement() could be implemented. See attached change log for details. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1665) drop operations may cause file leak
[ https://issues.apache.org/jira/browse/HIVE-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1665: - Component/s: Metastore drop operations may cause file leak --- Key: HIVE-1665 URL: https://issues.apache.org/jira/browse/HIVE-1665 Project: Hive Issue Type: Bug Components: Metastore Reporter: He Yongqiang Assignee: He Yongqiang Attachments: hive-1665.1.patch Right now when doing a drop, Hive first drops metadata and then drops the actual files. If the file system is down at that time, the files are never deleted. Had an offline discussion about this: to fix this, add a new conf scratch dir into hive conf. When doing a drop operation: 1) move data to the scratch directory 2) drop metadata 3.1) if 2) failed, roll back 1) and report an error 3.2) if 2) succeeded, drop data from the scratch directory 4) if 3.2) fails, we are OK because we assume the scratch dir will be emptied manually. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
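The drop procedure proposed in HIVE-1665 can be sketched roughly as follows. This is an illustration on the local file system with java.nio.file, not the metastore code: the class DropSketch, the MetadataDropper interface and the dropTable method are hypothetical names, and the table data is assumed to be a single file to keep the example short.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class DropSketch {
    // Hypothetical stand-in for the metastore call that removes table metadata.
    interface MetadataDropper {
        void drop() throws Exception;
    }

    static void dropTable(Path data, Path scratchDir, MetadataDropper metadata) throws Exception {
        Path parked = scratchDir.resolve(data.getFileName());
        Files.move(data, parked);              // 1) move data to the scratch directory
        try {
            metadata.drop();                   // 2) drop metadata
        } catch (Exception e) {
            Files.move(parked, data);          // 3.1) roll back the move and report the error
            throw e;
        }
        try {
            Files.delete(parked);              // 3.2) drop data from the scratch directory
        } catch (IOException ignored) {
            // 4) acceptable: the scratch dir is assumed to be emptied manually
        }
    }

    public static void main(String[] args) throws Exception {
        Path warehouse = Files.createTempDirectory("warehouse");
        Path scratch = Files.createTempDirectory("scratch");
        Path table = Files.createFile(warehouse.resolve("t1"));
        dropTable(table, scratch, () -> { });
        System.out.println("table still present: " + Files.exists(table));
    }
}
```

The ordering matters: a failed metadata drop leaves the data where it started, and a failed cleanup leaves only files in the scratch directory, never files without metadata in the warehouse.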
[jira] [Updated] (HIVE-1666) retry metadata operation in case of an failure
[ https://issues.apache.org/jira/browse/HIVE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1666: - Component/s: Metastore retry metadata operation in case of an failure -- Key: HIVE-1666 URL: https://issues.apache.org/jira/browse/HIVE-1666 Project: Hive Issue Type: Improvement Components: Metastore, Query Processor Reporter: Namit Jain Assignee: Paul Yang If a user is trying to insert into a partition, insert overwrite table T partition (p) select .. it is possible that the directory gets created, but the metadata creation of T@p fails. Currently, we just throw an error, even though the final directory has been created. It would be useful to at least retry the metadata operation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
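The retry suggested in HIVE-1666 could look something like the helper below. It is a generic sketch, not Hive code: RetrySketch and retry are hypothetical names, the operation is passed as a java.util.concurrent.Callable, and no backoff between attempts is shown.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    // Runs op up to maxAttempts times and rethrows the last failure if every attempt fails.
    static <T> T retry(int maxAttempts, Callable<T> op) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = retry(3, () -> {
            if (++calls[0] < 3) throw new IllegalStateException("metadata creation failed");
            return "created";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A retry only makes sense here if the metadata operation is safe to repeat, which the issue does not discuss.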
[jira] [Updated] (HIVE-1667) Store the group of the owner of the table in metastore
[ https://issues.apache.org/jira/browse/HIVE-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1667: - Component/s: Security Store the group of the owner of the table in metastore -- Key: HIVE-1667 URL: https://issues.apache.org/jira/browse/HIVE-1667 Project: Hive Issue Type: New Feature Components: Security Reporter: Namit Jain Attachments: hive-1667.patch Currently, the group of the owner of the table is not stored in the metastore. Secondly, if you create a table, the table's owner group is set to the group for the parent. It is not read from the UGI passed in. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1690) HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack
[ https://issues.apache.org/jira/browse/HIVE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1690: - Component/s: JDBC HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack -- Key: HIVE-1690 URL: https://issues.apache.org/jira/browse/HIVE-1690 Project: Hive Issue Type: Improvement Components: JDBC Reporter: Eli Griv Priority: Minor In HivePreparedStatement.executeImmediate(String sql), the exception stack is broken, so it's impossible to know which method threw "Method not supported". FIX: HivePreparedStatement.java L166 - throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode()); + throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode(), e); L168 - throw new SQLException(ex.toString(), "08S01"); + throw new SQLException(ex.toString(), "08S01", ex); -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
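To show what the extra constructor argument in the HIVE-1690 fix buys, here is a small standalone sketch. ChainSketch and wrap are hypothetical names; the only library call relied on is the java.sql.SQLException(String, String, Throwable) constructor, which keeps the original exception as the cause.

```java
import java.sql.SQLException;

final class ChainSketch {
    // Wraps a failure the way the proposed fix does: the original exception is kept as the cause.
    static SQLException wrap(Exception ex) {
        return new SQLException(ex.toString(), "08S01", ex);
    }

    public static void main(String[] args) {
        Exception root = new UnsupportedOperationException("Method not supported");
        SQLException wrapped = wrap(root);
        // With the cause attached, the stack trace includes a "Caused by:" section
        // pointing at the method that actually threw.
        wrapped.printStackTrace();
    }
}
```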
[jira] [Updated] (HIVE-1071) Making RCFile concatenatable to reduce the number of files of the output
[ https://issues.apache.org/jira/browse/HIVE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1071: - Component/s: Serializers/Deserializers Making RCFile concatenatable to reduce the number of files of the output -- Key: HIVE-1071 URL: https://issues.apache.org/jira/browse/HIVE-1071 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Zheng Shao Hive automatically determines the number of reducers most of the time. Sometimes, we create a lot of small files. Hive has an option to merge those small files through a map-reduce job. Dhruba has an idea which can fix it even faster: if we can make RCFile concatenatable, then we can simply tell the namenode to merge these files. Pros: This approach does not do any I/O so it's faster. Cons: We have to zero-fill the files to make sure they can be concatenated (all blocks except the last have to be full HDFS blocks). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1189) Add package-info.java to Hive
[ https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1189: - Component/s: Diagnosability Add package-info.java to Hive - Key: HIVE-1189 URL: https://issues.apache.org/jira/browse/HIVE-1189 Project: Hive Issue Type: New Feature Components: Build Infrastructure, Diagnosability Affects Versions: 0.6.0 Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-1189.1.patch Hadoop automatically generates build/src/org/apache/hadoop/package-info.java with information like this: {code} /* * Generated by src/saveVersion.sh */ @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568", user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20") package org.apache.hadoop; {code} Hive should do the same thing so that we can easily know the version of the code at runtime. This will help us identify whether we are still running the same version of Hive, if we serialize the plan and later continue the execution (See HIVE-1100). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1189) Add package-info.java to Hive
[ https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1189: - Component/s: Build Infrastructure Add package-info.java to Hive - Key: HIVE-1189 URL: https://issues.apache.org/jira/browse/HIVE-1189 Project: Hive Issue Type: New Feature Components: Build Infrastructure, Diagnosability Affects Versions: 0.6.0 Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-1189.1.patch Hadoop automatically generates build/src/org/apache/hadoop/package-info.java with information like this: {code} /* * Generated by src/saveVersion.sh */ @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568", user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20") package org.apache.hadoop; {code} Hive should do the same thing so that we can easily know the version of the code at runtime. This will help us identify whether we are still running the same version of Hive, if we serialize the plan and later continue the execution (See HIVE-1100). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-613) Hive server fetch row incorrect NULL representation
[ https://issues.apache.org/jira/browse/HIVE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-613: Component/s: Server Infrastructure Hive server fetch row incorrect NULL representation --- Key: HIVE-613 URL: https://issues.apache.org/jira/browse/HIVE-613 Project: Hive Issue Type: Bug Components: Server Infrastructure Reporter: Eric Hwang Priority: Minor The Hive server fetch function does not correctly serialize null fields in the returned rows. Regardless of the actual null format representation within the table, the Hive server fetch function will always return null fields as NULL, creating a potential conflict with the actual string "NULL". -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-627) Optimizer should only access RowSchema (and not RowResolver)
[ https://issues.apache.org/jira/browse/HIVE-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-627: Component/s: Query Processor Optimizer should only access RowSchema (and not RowResolver) Key: HIVE-627 URL: https://issues.apache.org/jira/browse/HIVE-627 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Zheng Shao The column pruner is accessing RowResolver a lot of times, for things like reverseLookup, and get(alias, column). These are not necessary - we should not need to translate an internal name to (alias, column) and then translate back. We should be able to use internal name from one operator to the other, using RowSchema. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-660) Fix UDFLike for multi-line inputs
[ https://issues.apache.org/jira/browse/HIVE-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-660: Component/s: UDF Fix UDFLike for multi-line inputs - Key: HIVE-660 URL: https://issues.apache.org/jira/browse/HIVE-660 Project: Hive Issue Type: Bug Components: UDF Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-660.1.patch We should use DOTALL option in UDFLike, because '%' and '_' should also match to the newline. See http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html#DOTALL -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
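The effect of DOTALL described in HIVE-660 can be reproduced with a simplified LIKE-to-regex translation. This is a sketch, not the UDFLike implementation: LikeSketch, compileLike and like are hypothetical names, and escape handling in the LIKE pattern is left out.

```java
import java.util.regex.Pattern;

final class LikeSketch {
    // Simplified translation: '%' becomes ".*", '_' becomes ".", everything else is literal.
    // Escape characters in the LIKE pattern are not handled in this sketch.
    static Pattern compileLike(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        // DOTALL makes '.' match line terminators too, so '%' and '_' can span newlines.
        return Pattern.compile(sb.toString(), Pattern.DOTALL);
    }

    static boolean like(String s, String pattern) {
        return compileLike(pattern).matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(like("a\nb", "a%b"));
        System.out.println(Pattern.compile("a.*b").matcher("a\nb").matches());
    }
}
```

Without the flag, the second call in main does not match, because '.' excludes line terminators by default.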
[jira] [Updated] (HIVE-664) optimize UDF split
[ https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-664: Component/s: (was: Query Processor) UDF Labels: optimization (was: ) optimize UDF split -- Key: HIVE-664 URL: https://issues.apache.org/jira/browse/HIVE-664 Project: Hive Issue Type: Bug Components: UDF Reporter: Namit Jain Labels: optimization Min Zhou added a comment - 21/Jul/09 07:34 AM It's very useful for us . some comments: 1. Can you implement it directly with Text ? Avoiding string decoding and encoding would be faster. Of course that trick may lead to another problem, as String.split uses a regular expression for splitting. 2. getDisplayString() always return a string in lowercase. Namit Jain added a comment - 21/Jul/09 09:22 AM Committed. Thanks Emil Emil Ibrishimov added a comment - 21/Jul/09 10:48 AM There are some easy (compromise) ways to optimize split: 1. Check if the regex argument actually contains some regex specific characters and if it doesn't, do a straightforward split without converting to strings. 2. Assume some default value for the second argument (for example - split(str) to be equivalent to split(str, ' ') and optimize for this value 3. Have two separate split functions - one that does regex and one that splits around plain text. I think that 1 is a good choice and can be done rather quickly. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
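Option 1 from Emil's comment (check the delimiter for regex-specific characters and fall back to a plain split) might be sketched as below. SplitSketch, isPlain and split are hypothetical names. The sketch works on java.lang.String for brevity, so it only illustrates the dispatch between the two paths, not the byte-level work on Text that the comments also discuss.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class SplitSketch {
    // Characters that have a special meaning in java.util.regex patterns.
    private static final String REGEX_META = "\\[](){}.*+?^$|";

    // True if the delimiter can be treated as plain text.
    static boolean isPlain(String delim) {
        if (delim.isEmpty()) {
            return false;
        }
        for (int i = 0; i < delim.length(); i++) {
            if (REGEX_META.indexOf(delim.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    static List<String> split(String s, String delim) {
        List<String> out = new ArrayList<>();
        if (isPlain(delim)) {
            // Fast path: plain substring search, no regex engine involved.
            int start = 0;
            int idx;
            while ((idx = s.indexOf(delim, start)) >= 0) {
                out.add(s.substring(start, idx));
                start = idx + delim.length();
            }
            out.add(s.substring(start));
        } else {
            // Slow path: full regex split, keeping trailing empty strings like the fast path.
            for (String part : Pattern.compile(delim).split(s, -1)) {
                out.add(part);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,c", ","));
        System.out.println(split("a1b22c", "\\d+"));
    }
}
```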
[jira] [Updated] (HIVE-538) make hive_jdbc.jar self-containing
[ https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-538: Component/s: (was: Clients) JDBC make hive_jdbc.jar self-containing -- Key: HIVE-538 URL: https://issues.apache.org/jira/browse/HIVE-538 Project: Hive Issue Type: Improvement Components: JDBC Affects Versions: 0.3.0, 0.4.0, 0.6.0 Reporter: Raghotham Murthy Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are required in the classpath to run jdbc applications on hive. We need to do at least the following to get rid of most unnecessary dependencies: 1. get rid of dynamic serde and use a standard serialization format, maybe tab separated, json or avro 2. don't use hadoop configuration parameters 3. repackage thrift and fb303 classes into hive_jdbc.jar -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-663) column aliases should be supported
[ https://issues.apache.org/jira/browse/HIVE-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-663. - Resolution: Duplicate Hadoop Flags: [Reviewed] This was fixed a long time ago in some other ticket. column aliases should be supported -- Key: HIVE-663 URL: https://issues.apache.org/jira/browse/HIVE-663 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain select key as x from src where x 10; should work -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-535) Memory-efficient hash-based Aggregation
[ https://issues.apache.org/jira/browse/HIVE-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-535: Component/s: Query Processor Labels: optimization (was: ) Memory-efficient hash-based Aggregation --- Key: HIVE-535 URL: https://issues.apache.org/jira/browse/HIVE-535 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Zheng Shao Labels: optimization Currently there is a lot of memory overhead in the hash-based aggregation in GroupByOperator. The net result is that GroupByOperator cannot store many entries in its HashTable, flushes frequently, and cannot achieve a very good partial aggregation result. Here are some initial thoughts (some of them are from Joydeep a long time ago): A1. Serialize the key of the HashTable. This will eliminate the 16-byte per-object overhead of Java in keys (depending on how many objects there are in the key, the saving can be substantial). A2. Use more memory-efficient hash tables - java.util.HashMap has about 64 bytes of overhead per entry. A3. Use primitive arrays to store aggregation results. Basically, the UDAF should manage the array of aggregation results, so UDAFCount should manage a long[], UDAFAvg should manage a double[] and a long[]. The external code should pass an index to iterate/merge/terminate an aggregation result. This will eliminate the 16-byte per-object overhead of Java. More ideas are welcome. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
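Idea A3 from HIVE-535 (the UDAF owns primitive arrays and callers pass an index) can be pictured with an average aggregator like the one below. AvgBuffers and its methods are hypothetical and do not follow the actual Hive UDAF interfaces; the point is only that each group costs one double and one long instead of a separate Java object.

```java
import java.util.Arrays;

// Hypothetical average aggregator that keeps all group states in primitive arrays (idea A3).
final class AvgBuffers {
    private double[] sums = new double[16];
    private long[] counts = new long[16];
    private int size = 0;

    // Allocates a slot for a new group and returns its index, growing the arrays when full.
    int newSlot() {
        if (size == sums.length) {
            sums = Arrays.copyOf(sums, size * 2);
            counts = Arrays.copyOf(counts, size * 2);
        }
        return size++;
    }

    // Adds one input value to the group stored at the given slot.
    void iterate(int slot, double value) {
        sums[slot] += value;
        counts[slot]++;
    }

    // Merges a partial aggregation result (sum and count) into the given slot.
    void merge(int slot, double partialSum, long partialCount) {
        sums[slot] += partialSum;
        counts[slot] += partialCount;
    }

    // Returns the average for the slot, or NaN if the group received no values.
    double terminate(int slot) {
        return counts[slot] == 0 ? Double.NaN : sums[slot] / counts[slot];
    }

    public static void main(String[] args) {
        AvgBuffers avg = new AvgBuffers();
        int group = avg.newSlot();
        avg.iterate(group, 1.0);
        avg.iterate(group, 3.0);
        System.out.println(avg.terminate(group));
    }
}
```

The hash table then only needs to map a group key to an int slot, which combines naturally with ideas A1 and A2.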
[jira] [Updated] (HIVE-508) Better error message for UDF parameter handling
[ https://issues.apache.org/jira/browse/HIVE-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-508: Component/s: UDF Diagnosability Better error message for UDF parameter handling --- Key: HIVE-508 URL: https://issues.apache.org/jira/browse/HIVE-508 Project: Hive Issue Type: Bug Components: Diagnosability, UDF Reporter: Zheng Shao {code} CREATE TABLE x (a map<string,string>); SELECT round(a) FROM x; {code} This will show an error message: FAILED: Unknown exception : org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo cannot be cast to org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo We need a better error message like: FAILED: Unable to pass a (type: map<string,string>) to function round. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-475) Lines exceeding mapred.linerecordreader.maxlength should cause exceptions
[ https://issues.apache.org/jira/browse/HIVE-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-475: Component/s: Diagnosability Lines exceeding mapred.linerecordreader.maxlength should cause exceptions - Key: HIVE-475 URL: https://issues.apache.org/jira/browse/HIVE-475 Project: Hive Issue Type: Improvement Components: Diagnosability, Serializers/Deserializers Reporter: S. Alex Smith Currently, rows of data that exceed mapred.linerecordreader.maxlength vanish silently. Instead, an option should be added to indicate what to do under this circumstance (vanish the entire line, truncate after max length, or fail the job), but the default behavior should be job failure. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
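The three behaviours requested in HIVE-475 (drop the line, truncate it, or fail the job) could be modelled as a policy enum. Everything here is hypothetical: LongLineSketch, Policy and apply are illustrative names, no such Hive option is described in the issue beyond the idea, and the sketch measures length in Java chars rather than in bytes as a record reader would.

```java
import java.util.Optional;

final class LongLineSketch {
    // The three behaviours named in the issue.
    enum Policy { SKIP, TRUNCATE, FAIL }

    // Returns the line to emit, or an empty Optional if the line is dropped.
    static Optional<String> apply(String line, int maxLength, Policy policy) {
        if (line.length() <= maxLength) {
            return Optional.of(line);
        }
        switch (policy) {
            case SKIP:
                return Optional.empty();
            case TRUNCATE:
                return Optional.of(line.substring(0, maxLength));
            default:
                // FAIL: surface the problem instead of losing the row silently.
                throw new IllegalStateException(
                    "line of length " + line.length() + " exceeds limit " + maxLength);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("abcdef", 4, Policy.TRUNCATE));
        System.out.println(apply("abcdef", 4, Policy.SKIP));
    }
}
```

Making FAIL the default matches the reporter's request that silent data loss should be opt-in.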
[jira] [Updated] (HIVE-71) log details on rows that cause hive exceptions
[ https://issues.apache.org/jira/browse/HIVE-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-71: --- Component/s: Serializers/Deserializers Query Processor Diagnosability log details on rows that cause hive exceptions -- Key: HIVE-71 URL: https://issues.apache.org/jira/browse/HIVE-71 Project: Hive Issue Type: Bug Components: Diagnosability, Logging, Query Processor, Serializers/Deserializers Reporter: Joydeep Sen Sarma users are logging all rows in order to find out the row that's causing exceptions. instead we should just log as much information as possible on the row that causes exception in hive stack -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1362) column level statistics
[ https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1362: - Component/s: Statistics column level statistics --- Key: HIVE-1362 URL: https://issues.apache.org/jira/browse/HIVE-1362 Project: Hive Issue Type: Sub-task Components: Statistics Reporter: Ning Zhang -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1361) table/partition level statistics
[ https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1361: - Component/s: Statistics table/partition level statistics Key: HIVE-1361 URL: https://issues.apache.org/jira/browse/HIVE-1361 Project: Hive Issue Type: Sub-task Components: Query Processor, Statistics Reporter: Ning Zhang Fix For: 0.7.0 Attachments: HIVE-1361.2.patch, HIVE-1361.2_java_only.patch, HIVE-1361.3.patch, HIVE-1361.4.java_only.patch, HIVE-1361.4.patch, HIVE-1361.5.java_only.patch, HIVE-1361.5.patch, HIVE-1361.java_only.patch, HIVE-1361.patch, stats0.patch At the first step, we gather table-level stats for non-partitioned table and partition-level stats for partitioned table. Future work could extend the table level stats to partitioned table as well. There are 3 major milestones in this subtask: 1) extend the insert statement to gather table/partition level stats on-the-fly. 2) extend metastore API to support storing and retrieving stats for a particular table/partition. 3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for existing tables/partitions. The proposed stats are: Partition-level stats: - number of rows - total size in bytes - number of files - max, min, average row sizes - max, min, average file sizes Table-level stats in addition to partition level stats: - number of partitions -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
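Several of the partition-level stats proposed in HIVE-1361 (number of files, total size in bytes, max, min and average file size) fall out of a single pass over the file sizes. The sketch below uses java.util.LongSummaryStatistics for that; PartitionStatsSketch and fileStats are hypothetical names, and how the sizes are obtained from the file system is outside the sketch.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

final class PartitionStatsSketch {
    // Summarizes the file sizes (in bytes) of one partition:
    // count = number of files, sum = total size, plus min, max and average file size.
    static LongSummaryStatistics fileStats(long[] fileSizes) {
        return LongStream.of(fileSizes).summaryStatistics();
    }

    public static void main(String[] args) {
        LongSummaryStatistics stats = fileStats(new long[] {10L, 20L, 60L});
        System.out.println("files=" + stats.getCount()
            + " bytes=" + stats.getSum()
            + " min=" + stats.getMin()
            + " max=" + stats.getMax()
            + " avg=" + stats.getAverage());
    }
}
```

The same shape would work for row sizes, and table-level stats could be obtained by combining the per-partition summaries and counting the partitions.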
[jira] [Updated] (HIVE-1940) Query Optimization Using Column Metadata and Histograms
[ https://issues.apache.org/jira/browse/HIVE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1940: - Component/s: Statistics Query Optimization Using Column Metadata and Histograms --- Key: HIVE-1940 URL: https://issues.apache.org/jira/browse/HIVE-1940 Project: Hive Issue Type: New Feature Components: Metastore, Query Processor, Statistics Reporter: Anja Gruenheid Attachments: HiveMetaStore.pdf The current basis for cost-based query optimization in Hive is information gathered on tables and partitions. To make further improvements in query optimization possible, the next step is to develop and implement possibilities to gather information on columns as discussed in issue HIVE-33. After that, an implementation of histograms is a possible option to use and collect run-time statistics. Next to the actual implementation of these features, it is also necessary to develop a consistent storage model for the MetaStore. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-109) 'location' clause for table creation should only be allowed for external tables
[ https://issues.apache.org/jira/browse/HIVE-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-109: Component/s: Metastore 'location' clause for table creation should only be allowed for external tables --- Key: HIVE-109 URL: https://issues.apache.org/jira/browse/HIVE-109 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Reporter: Joydeep Sen Sarma currently - the code does not by and large distinguish between external and internal tables. one clear distinction though is that storage for external tables is managed outside hive. this leads to consequences like HIVE-86 - so that hive does not mess around with tables whose storage is managed externally. however - currently - we allow users to specify location for internal tables - which is confusing and could lead to data being deleted in external folders. we should not allow this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-55) restrict table and column names to be alphanumeric and _ characters
[ https://issues.apache.org/jira/browse/HIVE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-55: --- Component/s: Metastore restrict table and column names to be alphanumeric and _ characters --- Key: HIVE-55 URL: https://issues.apache.org/jira/browse/HIVE-55 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Reporter: Prasad Chakka Currently the DDL restricts table and column names to alphanumeric and _ characters, but the restriction is not enforced when tables are created or altered directly through metastore clients. This JIRA aims to fix that.
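The naming rule described in HIVE-55 can be illustrated with a small check. This is only a sketch in Python, not the metastore's actual validation code; the function name `is_valid_identifier` is hypothetical, and the pattern is an assumption read directly from the issue title (ASCII letters, digits, and underscore).

```python
import re

# Assumed rule, taken from the issue title: one or more ASCII letters,
# digits, or underscores. Not copied from the Hive source.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def is_valid_identifier(name: str) -> bool:
    """Return True if name uses only ASCII letters, digits, and underscores."""
    # fullmatch requires the whole string to match, so a trailing newline
    # or an embedded space is rejected.
    return _VALID_NAME.fullmatch(name) is not None
```

Running the same check in every code path that creates or alters a table (DDL and direct metastore clients alike) is the point of the issue; where that check would live in Hive is not specified here.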
[jira] [Updated] (HIVE-80) Add testcases for concurrent query execution
[ https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-80: --- Component/s: Server Infrastructure Labels: concurrency (was: ) Add testcases for concurrent query execution Key: HIVE-80 URL: https://issues.apache.org/jira/browse/HIVE-80 Project: Hive Issue Type: Test Components: Query Processor, Server Infrastructure Reporter: Raghotham Murthy Assignee: Arvind Prabhakar Priority: Critical Labels: concurrency Attachments: hive_input_format_race-2.patch Can use one driver object per query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-149) Aggregate functions MIN and MAX should support all types
[ https://issues.apache.org/jira/browse/HIVE-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-149: Component/s: (was: Query Processor) UDF Aggregate functions MIN and MAX should support all types Key: HIVE-149 URL: https://issues.apache.org/jira/browse/HIVE-149 Project: Hive Issue Type: Improvement Components: UDF Reporter: YihueyChyi Assignee: David Phillips Priority: Critical -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-156) Allow != in place of <>
[ https://issues.apache.org/jira/browse/HIVE-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-156. - Resolution: Duplicate Fixed in HIVE-899. Allow != in place of <> --- Key: HIVE-156 URL: https://issues.apache.org/jira/browse/HIVE-156 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: S. Alex Smith Priority: Trivial I'm used to using != for inequality. It would be nice if Hive supported this as an alternative to <>.
[jira] [Resolved] (HIVE-191) Update methods in Hive class to specify database name
[ https://issues.apache.org/jira/browse/HIVE-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-191. - Resolution: Duplicate Fixed in HIVE-675. Update methods in Hive class to specify database name - Key: HIVE-191 URL: https://issues.apache.org/jira/browse/HIVE-191 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Johan Oskarsson Priority: Minor In the query processor module there is a Hive class used to access various Metastore data. Unfortunately most of those methods only work on the default database. We should update them to work on other databases as well by adding a database name parameter. See HIVE-182 for more background information. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-293) report deserialize exceptions from serde's via exceptions
[ https://issues.apache.org/jira/browse/HIVE-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-293: Component/s: Diagnosability report deserialize exceptions from serde's via exceptions - Key: HIVE-293 URL: https://issues.apache.org/jira/browse/HIVE-293 Project: Hive Issue Type: Bug Components: Diagnosability, Serializers/Deserializers Reporter: Joydeep Sen Sarma lazyserde and dynamicserde should report exceptions on number (and other) parsing errors so higher layers can take the correct action -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-301) Ability to store row counts (and other stats) in metastore and obtain them via queries
[ https://issues.apache.org/jira/browse/HIVE-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-301. - Resolution: Duplicate I think this was covered by the recent stats work. Ability to store row counts (and other stats) in metastore and obtain them via queries -- Key: HIVE-301 URL: https://issues.apache.org/jira/browse/HIVE-301 Project: Hive Issue Type: New Feature Reporter: Joydeep Sen Sarma Now that insertion row counts are being bubbled out of the execution path, it would be good to stash them away in the metastore. It would also be good to make them viewable through some simple command (like the MySQL status commands, though perhaps we already have something we could reuse).
[jira] [Updated] (HIVE-305) Port Hadoop streaming's counters/status reporters to Hive Transforms
[ https://issues.apache.org/jira/browse/HIVE-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-305: Component/s: Query Processor Port Hadoop streaming's counters/status reporters to Hive Transforms Key: HIVE-305 URL: https://issues.apache.org/jira/browse/HIVE-305 Project: Hive Issue Type: New Feature Components: Query Processor Reporter: Venky Iyer https://issues.apache.org/jira/browse/HADOOP-1328 Introduced a way for a streaming process to update global counters and status using stderr stream to emit information. Use reporter:counter:group,counter,amount to update a counter. Use reporter:status:message to update status. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
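For illustration, the stderr protocol quoted in HIVE-305 could be emitted from a streaming script roughly as follows. This is a sketch in Python; the helper names (`counter_line`, `status_line`, `report`) and the group and counter names in the usage note are hypothetical, and only the two `reporter:` line formats come from the message above.

```python
import sys


def counter_line(group: str, counter: str, amount: int) -> str:
    """Format a counter update as reporter:counter:group,counter,amount."""
    return f"reporter:counter:{group},{counter},{amount}"


def status_line(message: str) -> str:
    """Format a status update as reporter:status:message."""
    return f"reporter:status:{message}"


def report(line: str, stream=sys.stderr) -> None:
    """Write one reporter line to stderr, where the framework reads it."""
    stream.write(line + "\n")
    stream.flush()
```

A transform script would call, for example, `report(counter_line("MyTransform", "bad_rows", 1))` each time it skips a malformed row, while continuing to write its normal output to stdout.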
[jira] [Updated] (HIVE-345) Extend Date UDFs to support time zone and full specs as in MySQL
[ https://issues.apache.org/jira/browse/HIVE-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-345: Component/s: (was: Query Processor) UDF Extend Date UDFs to support time zone and full specs as in MySQL Key: HIVE-345 URL: https://issues.apache.org/jira/browse/HIVE-345 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.3.0 Reporter: Zheng Shao Most of the Date UDFs in Hive are currently based on String instead of Date objects, and they have limited functionality compared with MySQL. http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_from-unixtime http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-add http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-sub http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_datediff We should make them fully compliant with what MySQL offers.
[jira] [Updated] (HIVE-361) Support seeks in some Hive File Formats
[ https://issues.apache.org/jira/browse/HIVE-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-361: Component/s: Serializers/Deserializers Support seeks in some Hive File Formats --- Key: HIVE-361 URL: https://issues.apache.org/jira/browse/HIVE-361 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Zheng Shao Seek support can be useful for a few applications: 1. Filter out a set of records quickly when the data is sorted on the filtering key; 2. Create a random sample out of a File. This might not be a short-term goal, but let's keep the discussions here so it does not get lost. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
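The first use case in HIVE-361 (skipping quickly to a key range when the data is sorted on the filtering key) can be sketched as a binary search over a seekable file. The layout below is an assumption made purely for illustration: fixed-width records holding one big-endian 64-bit key each. It is not any actual Hive file format, and the function names are hypothetical.

```python
import io
import struct

# Assumed record layout for this sketch: one signed 64-bit big-endian key.
RECORD = struct.Struct(">q")


def read_key(f, index: int) -> int:
    """Seek directly to record `index` and decode its key."""
    f.seek(index * RECORD.size)
    return RECORD.unpack(f.read(RECORD.size))[0]


def lower_bound(f, count: int, key: int) -> int:
    """Return the index of the first record whose key is >= `key`.

    Needs O(log count) seeks instead of a scan from the start of the file.
    """
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        if read_key(f, mid) < key:
            lo = mid + 1
        else:
            hi = mid
    return lo
```

Without seek support the reader has to scan from the beginning of the file even when it already knows the data is sorted, which is the cost the issue wants to avoid.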
[jira] [Updated] (HIVE-357) Add order-sensitive and order-insensitive hashing aggregation functions (UDAF)
[ https://issues.apache.org/jira/browse/HIVE-357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-357: Component/s: UDF Add order-sensitive and order-insensitive hashing aggregation functions (UDAF) -- Key: HIVE-357 URL: https://issues.apache.org/jira/browse/HIVE-357 Project: Hive Issue Type: New Feature Components: Query Processor, UDF Reporter: Zheng Shao Assignee: Zheng Shao In order to test whether a new version of Hive produces exactly the same result as an older version, we usually want to run a bunch of big queries as well as small queries. It's hard to compare the results of big queries, but if we have a hashing aggregation function, we can just aggregate the result of each big query and compare the single number.
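The difference between the two kinds of aggregate proposed in HIVE-357 can be sketched as follows. This is an illustrative Python sketch, not the UDAF from the issue: the per-row hash (SHA-256 of the row's `repr`, truncated to 64 bits), the multiplier, and all function names are assumptions chosen for the example.

```python
import hashlib

MASK = (1 << 64) - 1  # keep results in 64 bits


def row_hash(row) -> int:
    """Hash one row to a 64-bit integer (assumed scheme for this sketch)."""
    digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def order_insensitive_hash(rows) -> int:
    """Sum of row hashes: the same rows in any order give the same value."""
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & MASK
    return total


def order_sensitive_hash(rows) -> int:
    """Polynomial rolling combination: reordering rows changes the value."""
    acc = 0
    for row in rows:
        acc = (acc * 1000003 + row_hash(row)) & MASK
    return acc
```

The order-insensitive variant suits queries without ORDER BY, where two correct runs may return rows in different orders; because addition is commutative and associative, partial sums from different tasks can also be merged. The order-sensitive variant would be the one to use when the query guarantees an ordering.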
[jira] [Updated] (HIVE-364) Hive Operators should calculate the value of common expressions just once
[ https://issues.apache.org/jira/browse/HIVE-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-364: Component/s: (was: Serializers/Deserializers) Query Processor Hive Operators should calculate the value of common expressions just once - Key: HIVE-364 URL: https://issues.apache.org/jira/browse/HIVE-364 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Zheng Shao Currently, if we have t.a + t.b in 2 different expressions in the select clause / where clause, we are computing it twice. We should cache the value of the expression evaluation result to save CPU time. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
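The caching idea in HIVE-364 can be sketched with a per-row memo table keyed by the expression itself. The tuple-based expression encoding and the `evaluate` function below are hypothetical and exist only for this Python illustration; Hive's operator and expression classes are not modelled.

```python
def evaluate(expr, row, cache, stats):
    """Evaluate a tuple-encoded expression, reusing results cached for this row.

    Supported forms (all assumed for this sketch):
      ("col", name), ("const", value), ("+", left, right), (">", left, right)
    """
    if expr in cache:
        return cache[expr]
    op = expr[0]
    if op == "col":
        value = row[expr[1]]
    elif op == "const":
        value = expr[1]
    elif op == "+":
        stats["adds"] += 1  # count how often the addition really runs
        value = evaluate(expr[1], row, cache, stats) + evaluate(expr[2], row, cache, stats)
    elif op == ">":
        value = evaluate(expr[1], row, cache, stats) > evaluate(expr[2], row, cache, stats)
    else:
        raise ValueError(f"unknown operator: {op!r}")
    cache[expr] = value
    return value
```

With `total = ("+", ("col", "a"), ("col", "b"))` appearing both in the select list and in a filter such as `(">", total, ("const", 0))`, evaluating both against one row with a shared `cache` performs the addition once. The cache has to be reset for every row, since the cached values depend on the row's column values.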
[jira] [Updated] (HIVE-436) MIN and MAX should inherit type
[ https://issues.apache.org/jira/browse/HIVE-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-436: Component/s: UDF MIN and MAX should inherit type --- Key: HIVE-436 URL: https://issues.apache.org/jira/browse/HIVE-436 Project: Hive Issue Type: Wish Components: UDF Reporter: Adam Kramer MIN and MAX functions currently return the DOUBLE type...but really, they should return the same type as the column they operate on. In some cases like SUM, it's possible that the result would overflow making DOUBLE more useful as it can drop digits and swap to scientific notation, but MIN and MAX by definition cannot have this problem because the answers are always represented in the column they are run across. Easy workaround: CAST all of my MINs and MAXes. It's just a wish. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
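One practical reason for MIN and MAX to keep the column type, beyond the convenience mentioned in HIVE-436, is that a 64-bit double cannot represent every large integer exactly. The Python sketch below uses `float` as a stand-in for DOUBLE and `int` as a stand-in for a BIGINT column; it only illustrates the arithmetic and makes no claim about how Hive's aggregates were implemented at the time.

```python
def max_via_double(values):
    """Mimic an aggregate that converts every input to a double first."""
    return max(float(v) for v in values)


def max_preserving_type(values):
    """Mimic an aggregate that returns the same type as its inputs."""
    return max(values)


# 2**53 + 1 is the smallest positive integer that a double cannot hold exactly.
big_values = [2**53, 2**53 + 1]
```

Here `max_preserving_type(big_values)` returns `2**53 + 1` unchanged, while `max_via_double(big_values)` returns `9007199254740992.0`, which is `2**53`: the true maximum has been rounded to its neighbour.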
[jira] [Updated] (HIVE-441) Convert field, index, AND, OR operators to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-441: Component/s: UDF Convert field, index, AND, OR operators to GenericUDF - Key: HIVE-441 URL: https://issues.apache.org/jira/browse/HIVE-441 Project: Hive Issue Type: Improvement Components: UDF Reporter: Zheng Shao Assignee: Zheng Shao Once the GenericUDF framework is in, we should convert exprNodeFieldDesc, exprNodeIndexDesc to GenericUDF to simplify the code. We should also convert AND and OR to GenericUDF in order to take advantage of short-circuit evaluation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-0.7.0-h0.20 #67
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/67/changes Changes: [cws] HIVE-2054. Exception on windows when using the jdbc driver. 'IOException: The system cannot find the path specified' (Bennie Schut via cws) -- [...truncated 27327 lines...] [junit] Hive history file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051701_762911921.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: LOAD [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: LOAD [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1 [junit] Total MapReduce jobs = 1 [junit] Launching Job 1 out of 1 [junit] Number of reduce tasks determined at compile time: 1 [junit] In order to change the average load for a reducer (in bytes): [junit] set hive.exec.reducers.bytes.per.reducer=number [junit] In order 
to limit the maximum number of reducers: [junit] set hive.exec.reducers.max=number [junit] In order to set a constant number of reducers: [junit] set mapred.reduce.tasks=number [junit] Job running in-process (local Hadoop) [junit] 2011-04-05 17:01:29,841 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051701_458747547.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: LOAD [junit] Copying data from https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local 
inpath 'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: LOAD [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-04-05_17-01-31_397_1104374875420903108/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable
[jira] [Created] (HIVE-2096) throw a error if the input is larger than a threshold for index input format
throw a error if the input is larger than a threshold for index input format Key: HIVE-2096 URL: https://issues.apache.org/jira/browse/HIVE-2096 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: He Yongqiang This can hang forever.
[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016200#comment-13016200 ] He Yongqiang commented on HIVE-2090: Can you add an authorization check for drop database in this JIRA? Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch A DROP DATABASE ... FORCE will be useful when we use a database for isolation during tests. Being able to force cleanup of the database will make test cleanup easier.
[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2090: -- Attachment: HIVE-2090.3.patch Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful, when we use a database for isolation when doing some tests. Being able to force cleaning up the database will make test cleaning up easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2090: -- Attachment: (was: HIVE-2090.3.patch) Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful, when we use a database for isolation when doing some tests. Being able to force cleaning up the database will make test cleaning up easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2090: -- Attachment: HIVE-2090.3.patch Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful, when we use a database for isolation when doing some tests. Being able to force cleaning up the database will make test cleaning up easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2090: -- Status: Patch Available (was: Open) Moved the logic of dropping tables to ObjectStore level. The concurrency bug will be handled in separate JIRAs, HIVE-2093 and HIVE-2094. Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful, when we use a database for isolation when doing some tests. Being able to force cleaning up the database will make test cleaning up easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016204#comment-13016204 ] Siying Dong commented on HIVE-2090: --- Yongqiang, adding an authorization check for dropping and creating databases can take more effort than this. I'll see whether it is easy to finish it in HIVE-2093 together with the concurrency check. It doesn't sound like it belongs in this JIRA. Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful when we use a database for isolation during tests. Being able to force cleanup of the database will make test cleanup easier.
Review Request: HIVE-2090: Add DROP DATABASE ... FORCE
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/551/ --- Review request for hive. Summary --- https://issues.apache.org/jira/secure/attachment/12475548/HIVE-2090.3.patch This addresses bug HIVE-2090. https://issues.apache.org/jira/browse/HIVE-2090 Diffs - trunk/metastore/if/hive_metastore.thrift 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1088810 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1088810 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1088810 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1088810 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1088810 trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DropDatabaseDesc.java 1088810 trunk/ql/src/test/queries/clientpositive/database.q 1088810 trunk/ql/src/test/results/clientpositive/database.q.out 1088810 Diff: https://reviews.apache.org/r/551/diff Testing --- Thanks, Carl
[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016205#comment-13016205 ] jirapos...@reviews.apache.org commented on HIVE-2090: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/551/ --- Review request for hive. Summary --- https://issues.apache.org/jira/secure/attachment/12475548/HIVE-2090.3.patch This addresses bug HIVE-2090. https://issues.apache.org/jira/browse/HIVE-2090 Diffs - trunk/metastore/if/hive_metastore.thrift 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1088810 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1088810 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1088810 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1088810 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1088810 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1088810 trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1088810 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1088810 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DropDatabaseDesc.java 1088810 trunk/ql/src/test/queries/clientpositive/database.q 1088810 trunk/ql/src/test/results/clientpositive/database.q.out 1088810 
Diff: https://reviews.apache.org/r/551/diff Testing --- Thanks, Carl Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful, when we use a database for isolation when doing some tests. Being able to force cleaning up the database will make test cleaning up easier. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2068: -- Status: Patch Available (was: Open) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation --- Key: HIVE-2068 URL: https://issues.apache.org/jira/browse/HIVE-2068 Project: Hive Issue Type: Improvement Reporter: Siying Dong Assignee: Siying Dong Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT xxx starts a MapReduce job whose input is the whole table or partition. The latency can be huge if the table or partition is big. We could reduce the number of input files to speed up such queries.
[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation
[ https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2068: -- Attachment: HIVE-2068.4.patch Addressing Namit's comments. Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation --- Key: HIVE-2068 URL: https://issues.apache.org/jira/browse/HIVE-2068 Project: Hive Issue Type: Improvement Reporter: Siying Dong Assignee: Siying Dong Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, HIVE-2068.4.patch Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT xxx starts a MapReduce job whose input is the whole table or partition. The latency can be huge if the table or partition is big. We could reduce the number of input files to speed up such queries.
[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE
[ https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016209#comment-13016209 ] Siying Dong commented on HIVE-2090: --- One thing to note: when dropping all tables in a database, all files under the warehouse root of the DB are deleted. Data from tables/partitions in locations that are not under that root won't be deleted. This is similar to the current approach of dropping a table: data in partitions won't be deleted if their locations are not under the table's location. Add DROP DATABASE ... FORCE - Key: HIVE-2090 URL: https://issues.apache.org/jira/browse/HIVE-2090 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch A DROP DATABASE ... FORCE will be useful when we use a database for isolation during tests. Being able to force cleanup of the database will make test cleanup easier.
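The deletion rule described in the comment above (data under the database's warehouse root is removed, data elsewhere is left alone) amounts to a path-prefix test. The Python sketch below is only an illustration of that rule: the function names and the example paths in the usage note are hypothetical, and nothing here is taken from the HIVE-2090 patch.

```python
import posixpath


def is_under(root: str, location: str) -> bool:
    """Return True if `location` is `root` itself or lies beneath it."""
    root = posixpath.normpath(root)
    location = posixpath.normpath(location)
    # Compare on a path-component boundary so that /warehouse/db1_copy
    # is not mistaken for a child of /warehouse/db1.
    return location == root or location.startswith(root.rstrip("/") + "/")


def split_by_root(db_root: str, locations):
    """Partition table/partition locations into (deleted, left_in_place)."""
    deleted = [loc for loc in locations if is_under(db_root, loc)]
    kept = [loc for loc in locations if not is_under(db_root, loc)]
    return deleted, kept
```

For a database rooted at `/user/hive/warehouse/testdb.db`, a table stored at `/user/hive/warehouse/testdb.db/t1` would fall in the deleted group, while a table whose location is `/data/external/t2` would keep its files after the drop.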
[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive
[ https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1803: - Status: Open (was: Patch Available) I'm getting test failures still. * TestMinimrCliDriver:join1 * TestMTQueries:testMTQueries1 * TestParse: 44/45 tests failed These all need fixes before commit. Implement bitmap indexing in Hive - Key: HIVE-1803 URL: https://issues.apache.org/jira/browse/HIVE-1803 Project: Hive Issue Type: New Feature Components: Indexing Reporter: Marquis Wang Assignee: Marquis Wang Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch Implement bitmap index handler to complement compact indexing. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user
[ https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016241#comment-13016241 ]

jirapos...@reviews.apache.org commented on HIVE-1988:
---

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review394
---

Ship it!

+1

- Amareshwari

On 2011-04-05 21:24:34, Devaraj Das wrote:
bq.
bq. ---
bq. This is an automatically generated e-mail. To reply, visit:
bq. https://reviews.apache.org/r/528/
bq. ---
bq.
bq. (Updated 2011-04-05 21:24:34)
bq.
bq. Review request for hive.
bq.
bq. Summary
bq. ---
bq.
bq. Fixes to some security issues discussed in HIVE-1988
bq.
bq. This addresses bug HIVE-1988.
bq. https://issues.apache.org/jira/browse/HIVE-1988
bq.
bq. Diffs
bq. ---
bq.
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155
bq. http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155
bq.
bq. Diff: https://reviews.apache.org/r/528/diff
bq.
bq. Testing
bq. ---
bq.
bq. New unit test added and that passes. All unit tests passed.
bq.
bq. Thanks,
bq.
bq. Devaraj
bq.

Make the delegation token issued by the MetaStore owned by the right user
---

Key: HIVE-1988
URL: https://issues.apache.org/jira/browse/HIVE-1988
Project: Hive
Issue Type: Bug
Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
Fix For: 0.8.0
Attachments: hive-1988-3.patch, hive-1988.patch

The 'owner' of any delegation token issued by the MetaStore is set to the requesting user. When a delegation token is requested by the user himself during a job submission, this is fine. However, when the token is requested by a service (e.g., Oozie) on behalf of the user, the token's owner is set to the user the service is running as. Later on, when the token is used by a MapReduce task, the MetaStore treats the incoming request as coming from Oozie and performs operations as Oozie.
This means any new directories the MetaStore creates on HDFS (e.g., via create_table) will end up with Oozie as the owner. Also, the MetaStore doesn't check whether a user asking for a token on behalf of some other user is actually authorized to act on behalf of that other user. We should start using the ProxyUser authorization in the MetaStore (HADOOP-6510's APIs).
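
For readers following the HIVE-1988 description, here is a minimal sketch of the behaviour it asks for: a token requested by a service should be owned by the end user, and only after the service is checked against a proxy rule. This is a toy model in Python, not the patch itself and not the HADOOP-6510 API. The names DelegationToken, ProxyAuthorizationError, issue_token and the proxy_rules mapping are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class DelegationToken:
    """Toy stand-in for a MetaStore delegation token (hypothetical type)."""
    owner: str                # user the MetaStore acts as when the token is used
    real_user: Optional[str]  # service that requested the token for the owner, if any


class ProxyAuthorizationError(Exception):
    """Raised when a requester may not act on behalf of the requested owner."""


def issue_token(requester: str, on_behalf_of: Optional[str],
                proxy_rules: Dict[str, Set[str]]) -> DelegationToken:
    """Issue a token owned by the end user rather than by the calling service.

    proxy_rules maps a service name to the set of users it may impersonate.
    This mapping is an illustrative assumption, not Hadoop's configuration format.
    """
    if on_behalf_of is None or on_behalf_of == requester:
        # The user asked for the token directly: requester and owner coincide.
        return DelegationToken(owner=requester, real_user=None)
    if on_behalf_of not in proxy_rules.get(requester, set()):
        # The service is not allowed to impersonate this user: refuse.
        raise ProxyAuthorizationError(
            f"{requester} is not authorized to act on behalf of {on_behalf_of}")
    # Authorized proxy request: the end user owns the token, the service is recorded.
    return DelegationToken(owner=on_behalf_of, real_user=requester)


if __name__ == "__main__":
    rules = {"oozie": {"alice"}}
    token = issue_token("oozie", "alice", rules)
    print(token.owner, token.real_user)
```

With an owner set this way, a MapReduce task presenting the token would be treated as the end user, so directories created for create_table would be owned by that user rather than by the service account.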
[jira] [Commented] (HIVE-1095) Hive in Maven
[ https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016242#comment-13016242 ]

Amareshwari Sriramadasu commented on HIVE-1095:
---

bq. I tried the first command: ant make-maven -Dversion=0.8.0-SNAPSHOT -logfile make-maven.log and it seems to have succeeded.

It succeeded for me too, whereas maven-publish failed with 401/authorization errors.

bq. It would be nice if someone who has the knowledge could take a look and see if it is correct.

Giri/Carl, can you help here?

Hive in Maven
---

Key: HIVE-1095
URL: https://issues.apache.org/jira/browse/HIVE-1095
Project: Hive
Issue Type: Task
Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, hiveReleasedToMaven.tar.gz, make-maven.log

Getting Hive into the main Maven repositories.
Documentation on how to do this is at:
http://maven.apache.org/guides/mini/guide-central-repository-upload.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HIVE-1988
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review394
---

Ship it!

+1

- Amareshwari

On 2011-04-05 21:24:34, Devaraj Das wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/
---

(Updated 2011-04-05 21:24:34)

Review request for hive.

Summary
---

Fixes to some security issues discussed in HIVE-1988

This addresses bug HIVE-1988.
https://issues.apache.org/jira/browse/HIVE-1988

Diffs
---

http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155

Diff: https://reviews.apache.org/r/528/diff

Testing
---

New unit test added and that passes. All unit tests passed.

Thanks,

Devaraj