[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015794#comment-13015794
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review386
---



http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/528/#comment734

HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we remove 
it if it is no longer required?



http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
https://reviews.apache.org/r/528/#comment735

Do you want to move this into setup(), since it is common to both test cases?



http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
https://reviews.apache.org/r/528/#comment736

The code looks duplicated. Can it be refactored by passing the group names to a 
method?


- Amareshwari


On 2011-03-29 10:26:38, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/528/
bq.  ---
bq.  
bq.  (Updated 2011-03-29 10:26:38)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fixes to some security issues discussed in HIVE-1988
bq.  
bq.  
bq.  This addresses bug HIVE-1988.
bq.  https://issues.apache.org/jira/browse/HIVE-1988
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1085623
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1085623
bq.  
bq.  Diff: https://reviews.apache.org/r/528/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  New unit test added and that passes. All unit tests passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



 Make the delegation token issued by the MetaStore owned by the right user
 -

 Key: HIVE-1988
 URL: https://issues.apache.org/jira/browse/HIVE-1988
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.8.0

 Attachments: hive-1988-3.patch, hive-1988.patch


 The 'owner' of any delegation token issued by the MetaStore is set to the 
 requesting user. When the user requests a delegation token directly during 
 a job submission, this is fine. However, when the token is requested by a 
 service (e.g., Oozie) on behalf of the user, the token's owner is set to the 
 user the service is running as. Later on, when the token is used by a 
 MapReduce task, the MetaStore treats the incoming request as coming from 
 Oozie and performs operations as Oozie. This means any new directories 
 created on HDFS by the MetaStore (e.g., by create_table) will end up with 
 Oozie as the owner.
 Also, the MetaStore doesn't check whether a user asking for a token on behalf 
 of some other user is actually authorized to act on behalf of that other 
 user. We should start using the ProxyUser authorization in the MetaStore 
 (HADOOP-6510's APIs).
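
 The ownership problem and the proposed proxy-user check can be modelled in a
 few lines. This is only a minimal Python sketch and not MetaStore code:
 PROXY_RULES, Token, issue_token and the user names are hypothetical, and a
 real fix would presumably rely on Hadoop's ProxyUser authorization rather
 than a dictionary lookup.

```python
# Hypothetical model of delegation-token ownership; not Hive or Hadoop code.
from dataclasses import dataclass
from typing import Optional

# Assumed rule table: which service users may act on behalf of which end users.
PROXY_RULES = {"oozie": {"alice", "bob"}}


@dataclass(frozen=True)
class Token:
    owner: str      # user the MetaStore acts as when the token is presented
    real_user: str  # user that actually authenticated and asked for the token


def issue_token(authenticated_user: str, on_behalf_of: Optional[str] = None) -> Token:
    """Issue a token owned by the end user, after a proxy-user check."""
    if on_behalf_of is None or on_behalf_of == authenticated_user:
        # A user asking for their own token: owner and real user coincide.
        return Token(owner=authenticated_user, real_user=authenticated_user)
    allowed = PROXY_RULES.get(authenticated_user, set())
    if on_behalf_of not in allowed:
        raise PermissionError(
            f"{authenticated_user} may not act on behalf of {on_behalf_of}"
        )
    # The service is the real user, but the end user owns the token.
    return Token(owner=on_behalf_of, real_user=authenticated_user)
```

 With this model, issue_token("oozie", "alice") yields a token owned by
 alice, so later operations would run as alice rather than as oozie.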

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015795#comment-13015795
 ] 

Amareshwari Sriramadasu commented on HIVE-1988:
---

Changes look good overall. I updated the review board with some minor comments. 
You can upload the next patch with generated code.



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015818#comment-13015818
 ] 

Thiruvel Thirumoolan commented on HIVE-2032:


@Amareshwari

After the alter, new tables will be created under the new location. Old tables 
have the fully qualified location in their metadata, so they should continue to 
work as before.
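
The behaviour described above can be illustrated with a small model. This is
a hedged Python sketch, not metastore code: the Database class, its methods
and the cluster URIs are made up for illustration, and it assumes that a
table's location is resolved once at creation time and stored fully qualified.

```python
# Hypothetical model of a metastore database and its tables; not Hive code.
class Database:
    def __init__(self, name, location):
        self.name = name
        self.location = location
        self.tables = {}  # table name -> fully qualified location

    def create_table(self, table):
        # The location is resolved once, at creation time, and stored in full.
        self.tables[table] = f"{self.location}/{table}"

    def alter_location(self, new_location):
        # Only the database default changes; existing table entries are untouched.
        self.location = new_location


db = Database("db", "hdfs://old-cluster/warehouse/db.db")
db.create_table("t_old")
db.alter_location("hdfs://new-cluster/warehouse/db.db")
db.create_table("t_new")
```

Here t_old keeps pointing at the old cluster while t_new lands under the new
location.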

The reasons I went with alter location are:

1. Allow migration to happen if one would like to reorganize or if quota runs 
out. Not sure how many folks have this situation.
2. One can migrate new tables to another DFS cluster (existing cluster becoming 
full).
3. One can migrate between file systems if they have sufficient use cases.

Do you think these are valid use cases?

 create database does not honour warehouse.dir in dbproperties
 -

 Key: HIVE-2032
 URL: https://issues.apache.org/jira/browse/HIVE-2032
 Project: Hive
  Issue Type: Bug
  Components: Clients
Affects Versions: 0.7.0, 0.8.0
Reporter: Thiruvel Thirumoolan
Assignee: Thiruvel Thirumoolan
 Fix For: 0.8.0

 Attachments: DatabaseLocation.patch


 # create database db with dbproperties ('hive.metastore.warehouse.dir' = 
 'loc');
 The above command does not set location of 'db' to 'loc'. It instead creates 
 'db.db' under the warehouse directory configured in hive-site.xml of CLI. 
 Looks conflicting with HIVE-1820's expectation. If scratch dir is specified 
 here, that is honoured.



[jira] [Commented] (HIVE-2032) create database does not honour warehouse.dir in dbproperties

2011-04-05 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015925#comment-13015925
 ] 

Thiruvel Thirumoolan commented on HIVE-2032:


 Use cases make sense. But, drop database would remove tables only from new 
 location.

If I am not wrong, drop db succeeds only if all tables under it are dropped.

 Thiruvel, did you get a chance to test this? Because, your changes in patch 
 does not look complete. Changes should be propagated to PersistentManager 
 through ObjectStore.

Will take a look.



[jira] [Commented] (HIVE-1538) FilterOperator is applied twice with ppd on.

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13015943#comment-13015943
 ] 

jirapos...@reviews.apache.org commented on HIVE-1538:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/550/
---

Review request for hive, Yongqiang He and namit jain.


Summary
---

Patch updated to trunk with newly added configuration var 
hive.ppd.remove.duplicatefilters


This addresses bug HIVE-1538.
https://issues.apache.org/jira/browse/HIVE-1538


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 
  trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 
  trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 1088944 
  

[jira] [Updated] (HIVE-1538) FilterOperator is applied twice with ppd on.

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-1538:
--

Attachment: patch-1538-3.txt

Patch updated to trunk

 FilterOperator is applied twice with ppd on.
 

 Key: HIVE-1538
 URL: https://issues.apache.org/jira/browse/HIVE-1538
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Amareshwari Sriramadasu
Assignee: Amareshwari Sriramadasu
 Attachments: patch-1538-1.txt, patch-1538-2.txt, patch-1538-3.txt, 
 patch-1538.txt


 With hive.optimize.ppd set to true, FilterOperator is applied twice, and the 
 second operator always seems to filter zero rows.
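
 Why the second operator drops nothing can be seen from a toy model. The
 sketch below is illustrative Python, not Hive operator code: filter_op, the
 predicate and the sample rows are assumptions, and it only holds for a
 deterministic predicate.

```python
# Illustrative only: models a pushed-down predicate that is also left in place.
rows = [{"key": k, "value": f"val_{k}"} for k in range(10)]


def predicate(row):
    return row["key"] < 4


def filter_op(input_rows):
    """Return (rows that passed, number of rows filtered out)."""
    passed = [r for r in input_rows if predicate(r)]
    return passed, len(input_rows) - len(passed)


first_pass, dropped_first = filter_op(rows)          # pushed-down copy
second_pass, dropped_second = filter_op(first_pass)  # original copy, same predicate
```

 Every row reaching the second filter has already satisfied the predicate, so
 the second pass only re-evaluates it without removing anything.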



Query Regarding HIVE-1844.

2011-04-05 Thread Mohit
Hello He Yongqiang /All,

 

I was going through the defect HIVE-1844, but I was not able to reproduce
the scenario using Hive 0.5. I consistently saw OOM errors during the
Copy Task on the server side, but the client did not hang.

In your opinion, what could have made the client hang? In my case, the Hive
client received a proper response from Thrift whenever an OOM occurred on
the server side, for example:

java.sql.SQLException: org.apache.thrift.TApplicationException :
Internal error processing execute

Kindly give me pointers on reproducing it. Do I need to do more
regression testing on it?

Just a thought/observation:

Looking at the code change, why was the OOM caught so early (and as a
Throwable, which will swallow other exceptions as well)?

It would eventually have been caught by
ThriftHive$Processor$execute.process() and appropriate action would have
been taken, so I am wondering how the code change helped prevent the client
hang.
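
To make the concern about the broad catch concrete, here is a small Python
analogy. It is not the HIVE-1844 change: the function names are hypothetical,
`except BaseException` stands in for Java's `catch (Throwable t)`, and
MemoryError stands in for OutOfMemoryError.

```python
# Python analogy for the Java concern; all function names are hypothetical.
def run_query(fail_with):
    raise fail_with


def execute_broad(fail_with):
    try:
        run_query(fail_with)
    except BaseException:  # analogous to `catch (Throwable t)`
        return "internal error"  # the original failure type is lost


def execute_narrow(fail_with):
    try:
        run_query(fail_with)
    except MemoryError:  # analogous to catching only OutOfMemoryError
        return "out of memory"
```

With execute_broad, an unrelated failure such as ValueError is reported the
same way as an out-of-memory condition. With execute_narrow, it propagates to
the caller unchanged.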

 

 

Thanks and Regards,

-Mohit



Review Request: HIVE-1538

2011-04-05 Thread Amareshwari Sriramadasu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/550/
---

Review request for hive, Yongqiang He and namit jain.


Summary
---

Patch updated to trunk with newly added configuration var 
hive.ppd.remove.duplicatefilters


This addresses bug HIVE-1538.
https://issues.apache.org/jira/browse/HIVE-1538


Diffs
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1088944 
  trunk/contrib/src/test/results/clientpositive/dboutput.q.out 1088944 
  trunk/contrib/src/test/results/clientpositive/serde_typedbytes4.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_pushdown.q.out 1088944 
  trunk/hbase-handler/src/test/results/hbase_queries.q.out 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/ExprWalkerProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpWalkerInfo.java 1088944 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/PredicatePushDown.java 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_clusterby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_constant_expr.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_gby_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_multi_insert.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join1.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join2.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join3.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_outer_join4.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_random.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_transform.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_udf_case.q 1088944 
  trunk/ql/src/test/queries/clientpositive/ppd_union.q 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join0.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join11.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join12.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join13.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join14.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join16.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join19.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join20.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join21.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join23.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join27.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join5.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join6.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join7.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join8.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/auto_join9.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket4.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucket_groupby.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/case_sensitivity.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cast1.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/cluster.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/combine2.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/create_view.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/disable_merge_for_bucketing.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/filter_join_breaktask.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr.q.out 1088944 
  trunk/ql/src/test/results/clientpositive/groupby_map_ppr_multi_distinct.q.out 1088944 
  

[jira] [Created] (HIVE-2091) Test scripts need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)
Test scripts need to be made deterministic in their output
--

 Key: HIVE-2091
 URL: https://issues.apache.org/jira/browse/HIVE-2091
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor


Currently these two query scripts generate non-deterministic output:

The suggestion is to use GROUP BY statement.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Description: 
Currently these two query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use a GROUP BY statement.

  was:
Currently this 2 query scripts generate non-deterministic output:

The suggestion is to use GROUP BY statement.

Summary: Test scripts rcfile_columnar.q and join_filters.q   need to be 
made deterministic in their output  (was: Test scripts need to be made 
deterministic in their output)

 Test scripts rcfile_columnar.q and join_filters.q   need to be made 
 deterministic in their output
 -

 Key: HIVE-2091
 URL: https://issues.apache.org/jira/browse/HIVE-2091
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor

 Currently these two query scripts generate non-deterministic output: 
   * ql/src/test/queries/clientpositive/rcfile_columnar.q
   * ql/src/test/queries/clientpositive/join_filters.q  
 The suggestion is to use a GROUP BY statement.



[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation

2011-04-05 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-2068:
-

Status: Open  (was: Patch Available)

 Speed up query select xx,xx from xxx LIMIT xxx if no filtering or 
 aggregation
 ---

 Key: HIVE-2068
 URL: https://issues.apache.org/jira/browse/HIVE-2068
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
 Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch


 Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT 
 xxx will start a MapReduce job with input to be the whole table or 
 partition. The latency can be huge if the table or partition is big. We could 
 reduce number of input files to speed up the queries.
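
 The idea of reading fewer input files can be sketched as follows. This is an
 illustrative Python model and not the attached patch: rows_with_limit is a
 hypothetical helper, files are modelled as in-memory lists rather than HDFS
 files, and the actual patch may choose its inputs differently.

```python
# Hypothetical sketch: files are modelled as lists of rows, not HDFS files.
def rows_with_limit(files, limit):
    """Read input files one at a time and stop once `limit` rows are collected."""
    collected = []
    files_read = 0
    for rows in files:
        if len(collected) >= limit:
            break
        files_read += 1
        collected.extend(rows[: limit - len(collected)])
    return collected, files_read


# 50 files of 100 rows each; LIMIT 150 only needs to touch the first two.
files = [[f"f{i}_r{j}" for j in range(100)] for i in range(50)]
result, files_read = rows_with_limit(files, limit=150)
```

 Since there is no filtering or aggregation, any rows satisfy the query, which
 is what makes stopping early safe in this model.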



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Status: Patch Available  (was: Open)

Please take a look at the attached patch



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Roman Shaposhnik (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roman Shaposhnik updated HIVE-2091:
---

Attachment: HIVE-2091.patch

 Test scripts rcfile_columnar.q and join_filters.q   need to be made 
 deterministic in their output
 -

 Key: HIVE-2091
 URL: https://issues.apache.org/jira/browse/HIVE-2091
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor
 Attachments: HIVE-2091.patch


 Currently these two query scripts generate non-deterministic output: 
   * ql/src/test/queries/clientpositive/rcfile_columnar.q
   * ql/src/test/queries/clientpositive/join_filters.q  
 The suggestion is to use a GROUP BY statement.



[jira] [Updated] (HIVE-2091) Test scripts rcfile_columnar.q and join_filters.q need to be made deterministic in their output

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2091:
-

Description: 
Currently these two query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use an ORDER BY statement.

  was:
Currently this 2 query scripts generate non-deterministic output: 
  * ql/src/test/queries/clientpositive/rcfile_columnar.q
  * ql/src/test/queries/clientpositive/join_filters.q  

The suggestion is to use GROUP BY statement.


 Test scripts rcfile_columnar.q and join_filters.q   need to be made 
 deterministic in their output
 -

 Key: HIVE-2091
 URL: https://issues.apache.org/jira/browse/HIVE-2091
 Project: Hive
  Issue Type: Bug
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: Roman Shaposhnik
Priority: Minor
 Attachments: HIVE-2091.patch


 Currently these two query scripts generate non-deterministic output: 
   * ql/src/test/queries/clientpositive/rcfile_columnar.q
   * ql/src/test/queries/clientpositive/join_filters.q  
 The suggestion is to use an ORDER BY statement.
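
 The effect of an explicit ordering on a golden-file comparison can be shown
 with a toy model. This is a hedged Python sketch, not the test harness:
 run_query and the sample rows are hypothetical, and random.shuffle merely
 stands in for whatever makes the row order vary between runs.

```python
import random

# Stands in for the checked-in expected output of a test query.
expected = [(1, "a"), (2, "b"), (3, "c")]


def run_query(order_by=False):
    """Return the same rows in an arbitrary order unless order_by is set."""
    rows = list(expected)
    random.shuffle(rows)  # models an order that differs from run to run
    return sorted(rows) if order_by else rows


# With an explicit ordering, every run matches the expected output exactly.
assert all(run_query(order_by=True) == expected for _ in range(100))
# Without it, only the set of rows is guaranteed, not their order.
assert all(sorted(run_query()) == expected for _ in range(100))
```

 A line-by-line diff against the expected output is only stable in the
 ordered case.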



Build failed in Jenkins: Hive-0.7.0-h0.20 #66

2011-04-05 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/66/

--
[...truncated 27402 lines...]
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051246_596912665.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] 2011-04-05 12:46:53,444 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-50_400_7485125692539403914/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051246_1722588719.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_12-46-55_145_3892655606022687079/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table 

[jira] [Resolved] (HIVE-2072) test

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2072.
--

Resolution: Incomplete

 test
 

 Key: HIVE-2072
 URL: https://issues.apache.org/jira/browse/HIVE-2072
 Project: Hive
  Issue Type: Test
Reporter: YoungYik
Priority: Trivial



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2066) Metastore Schema Scripts

2011-04-05 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016098#comment-13016098
 ] 

Carl Steinbach commented on HIVE-2066:
--

This ticket is being used as a convenient public storage space for versioned 
dumps of the Hive MetaStore database schema.

 Metastore Schema Scripts
 

 Key: HIVE-2066
 URL: https://issues.apache.org/jira/browse/HIVE-2066
 Project: Hive
  Issue Type: Task
  Components: Metastore
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: hive-schema-0.3.0.derby.sql, 
 hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, 
 hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, 
 hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, 
 hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, 
 hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, 
 hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, 
 hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, 
 hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, 
 hive-schema-0.7.0.postgres.sql






[jira] [Resolved] (HIVE-2066) Metastore Schema Scripts

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2066.
--

Resolution: Not A Problem

 Metastore Schema Scripts
 

 Key: HIVE-2066
 URL: https://issues.apache.org/jira/browse/HIVE-2066
 Project: Hive
  Issue Type: Task
  Components: Metastore
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Attachments: hive-schema-0.3.0.derby.sql, 
 hive-schema-0.3.0.mysql.sql, hive-schema-0.3.0.postgres.sql, 
 hive-schema-0.4.0.derby.sql, hive-schema-0.4.0.mysql.sql, 
 hive-schema-0.4.0.postgres.sql, hive-schema-0.4.1.derby.sql, 
 hive-schema-0.4.1.mysql.sql, hive-schema-0.4.1.postgres.sql, 
 hive-schema-0.5.0.derby.sql, hive-schema-0.5.0.mysql.sql, 
 hive-schema-0.5.0.postgres.sql, hive-schema-0.6.0.derby.sql, 
 hive-schema-0.6.0.mysql.sql, hive-schema-0.6.0.postgres.sql, 
 hive-schema-0.7.0.derby.sql, hive-schema-0.7.0.mysql.sql, 
 hive-schema-0.7.0.postgres.sql






[jira] [Resolved] (HIVE-1668) Move HWI out to Github

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1668.
--

Resolution: Not A Problem

 Move HWI out to Github
 --

 Key: HIVE-1668
 URL: https://issues.apache.org/jira/browse/HIVE-1668
 Project: Hive
  Issue Type: Improvement
  Components: Web UI
Reporter: Jeff Hammerbacher

 I have seen HWI cause a number of build and test errors, and it's now going 
 to cost us some extra work for integration with security. We've worked on 
 hundreds of clusters at Cloudera and I've never seen anyone use HWI. With the 
 Beeswax UI available in Hue, it's unlikely that anyone would prefer to stick 
 with HWI. I think it's time to move it out to Github.



[jira] [Created] (HIVE-2093) inputs are outputs should be populated for create/drop database

2011-04-05 Thread Namit Jain (JIRA)
inputs are outputs should be populated for create/drop database
---

 Key: HIVE-2093
 URL: https://issues.apache.org/jira/browse/HIVE-2093
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: Siying Dong


This is needed for many other things: concurrency, authorization etc. to work



[jira] [Created] (HIVE-2092) support 'drop database DBNAME force';

2011-04-05 Thread Namit Jain (JIRA)
support 'drop database DBNAME force';
---

 Key: HIVE-2092
 URL: https://issues.apache.org/jira/browse/HIVE-2092
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
Assignee: Siying Dong


Currently, the above command fails if the database is not empty.




[jira] [Resolved] (HIVE-2092) support 'drop database DBNAME force';

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2092.
--

Resolution: Duplicate

Duplicate of HIVE-2090

 support 'drop database DBNAME force';
 ---

 Key: HIVE-2092
 URL: https://issues.apache.org/jira/browse/HIVE-2092
 Project: Hive
  Issue Type: New Feature
Reporter: Namit Jain
Assignee: Siying Dong

 Currently, the above command fails if the database is not empty.



[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: HIVE-2090.2.patch

This is an in-progress patch. It changes the syntax to CASCADE/RESTRICT instead 
of FORCE. We had some discussion offline and decided to do the logic at the 
object store level, so I need to make some more changes. We'll open other 
issues for fixing concurrency and authorization around dropping and creating 
databases.

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch


 A DROP DATABASE ... FORCE will be useful when we use a database for 
 isolation while running tests. Being able to force cleanup of the database 
 will make test cleanup easier.
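
For illustration only, the difference between the two drop modes mentioned in the comment above (RESTRICT refuses a non-empty database, CASCADE removes its tables along with it) can be sketched with a small in-memory model. The class DropDatabaseSketch, its methods, and the DropMode enum are hypothetical names for this sketch; this is not the attached patch and not Hive's metastore API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory catalog used only to illustrate the two drop modes.
class DropDatabaseSketch {
    enum DropMode { RESTRICT, CASCADE }

    // database name -> names of the tables it contains
    private final Map<String, List<String>> tablesByDatabase = new HashMap<>();

    void createDatabase(String db) {
        tablesByDatabase.putIfAbsent(db, new ArrayList<>());
    }

    void createTable(String db, String table) {
        List<String> tables = tablesByDatabase.get(db);
        if (tables == null) {
            throw new IllegalArgumentException("No such database: " + db);
        }
        tables.add(table);
    }

    boolean exists(String db) {
        return tablesByDatabase.containsKey(db);
    }

    void dropDatabase(String db, DropMode mode) {
        List<String> tables = tablesByDatabase.get(db);
        if (tables == null) {
            throw new IllegalArgumentException("No such database: " + db);
        }
        // RESTRICT: refuse to drop a database that still contains tables.
        if (mode == DropMode.RESTRICT && !tables.isEmpty()) {
            throw new IllegalStateException("Database " + db + " is not empty");
        }
        // CASCADE (or an empty database): the tables go away with the database entry.
        tablesByDatabase.remove(db);
    }
}
```

In a test that uses a database for isolation, the cleanup step would then be a single CASCADE drop rather than one drop per table.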



[jira] [Created] (HIVE-2094) CREATE and DROP DATABASE doesn't check user permission for doing it

2011-04-05 Thread Siying Dong (JIRA)
CREATE and DROP DATABASE doesn't check user permission for doing it
---

 Key: HIVE-2094
 URL: https://issues.apache.org/jira/browse/HIVE-2094
 Project: Hive
  Issue Type: Bug
Reporter: Siying Dong
Assignee: He Yongqiang


We need to make sure that only users with system permission can do it.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-05 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Status: Patch Available  (was: Open)

 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, 
 HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, 
 HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, 
 bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.



[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-05 Thread Marquis Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marquis Wang updated HIVE-1803:
---

Attachment: unit-tests.patch
HIVE-1803.11.patch

New patch that fixes the minor javadocs comments from patch 10.

A unit-tests patch that updates all the unit tests that were affected by the 
virtual column change.

 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, 
 HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, 
 HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, 
 bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.



Re: Review Request: HIVE-1988

2011-04-05 Thread Devaraj Das

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/
---

(Updated 2011-04-05 21:24:34.129643)


Review request for hive.


Changes
---

Addressed Amareshwari's comments.


Summary
---

Fixes to some security issues discussed in HIVE-1988


This addresses bug HIVE-1988.
https://issues.apache.org/jira/browse/HIVE-1988


Diffs (updated)
-

  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 1089155 

Diff: https://reviews.apache.org/r/528/diff


Testing
---

New unit test added and that passes. All unit tests passed.


Thanks,

Devaraj



[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016128#comment-13016128
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/
---

(Updated 2011-04-05 21:24:34.129643)


Review request for hive.


Changes
---

Addressed Amareshwari's comments.


Summary
---

Fixes to some security issues discussed in HIVE-1988


This addresses bug HIVE-1988.
https://issues.apache.org/jira/browse/HIVE-1988


Diffs (updated)
-

  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1089155 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 1089155 

Diff: https://reviews.apache.org/r/528/diff


Testing
---

New unit test added and that passes. All unit tests passed.


Thanks,

Devaraj



 Make the delegation token issued by the MetaStore owned by the right user
 -

 Key: HIVE-1988
 URL: https://issues.apache.org/jira/browse/HIVE-1988
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.8.0

 Attachments: hive-1988-3.patch, hive-1988.patch


 The 'owner' of any delegation token issued by the MetaStore is set to the 
 requesting user. When a delegation token is requested by the user himself during 
 a job submission, this is fine. However, when the token is 
 requested by a service (e.g., Oozie) on behalf of the user, the token's 
 owner is set to the user the service is running as. Later on, when the token 
 is used by a MapReduce task, the MetaStore treats the incoming request as 
 coming from Oozie and does operations as Oozie. This means any new directory 
 creations (e.g., create_table) on HDFS by the MetaStore will end up with 
 Oozie as the owner.
 Also, the MetaStore doesn't check whether a user asking for a token on behalf 
 of some other user is actually authorized to act on behalf of that other 
 user. We should start using the ProxyUser authorization in the MetaStore 
 (HADOOP-6510's APIs).
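
As a rough sketch of the behaviour described above, the owner recorded in a token can be chosen from the user the request is made for, after checking that the calling service is allowed to act for that user. Everything here (TokenOwnerSketch, allowProxy, resolveTokenOwner, the in-memory allow-list) is a hypothetical stand-in; the actual fix is expected to rely on Hadoop's proxy-user support rather than on code like this.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of "who should own the delegation token".
class TokenOwnerSketch {
    // service name -> users the service may act on behalf of (illustrative allow-list)
    private final Map<String, Set<String>> allowedProxies = new HashMap<>();

    void allowProxy(String service, String user) {
        allowedProxies.computeIfAbsent(service, k -> new HashSet<>()).add(user);
    }

    /**
     * Returns the owner to record in a token requested by 'caller'
     * on behalf of 'onBehalfOf' (null means the caller wants its own token).
     */
    String resolveTokenOwner(String caller, String onBehalfOf) {
        if (onBehalfOf == null || onBehalfOf.equals(caller)) {
            return caller; // a user asking for a token for himself
        }
        Set<String> users = allowedProxies.get(caller);
        if (users == null || !users.contains(onBehalfOf)) {
            throw new SecurityException(
                caller + " is not authorized to act on behalf of " + onBehalfOf);
        }
        // The token belongs to the real end user, not to the service account.
        return onBehalfOf;
    }
}
```

With an allow-list entry for ("oozie", "alice"), a request by oozie for alice yields a token owned by alice, while a request by oozie for any other user is rejected.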



[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016129#comment-13016129
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-



bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.   
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java,
 line 152
bq.   https://reviews.apache.org/r/528/diff/2/?file=14844#file14844line152
bq.  
bq.   HadoopShims.isSecureShimImpl() is not called anywhere else. Shall we 
remove it if not required anymore?

I suggest we leave it there. This seems like a useful method, and I am actually 
using it in another patch.


bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.   
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
 lines 144-156
bq.   https://reviews.apache.org/r/528/diff/2/?file=14850#file14850line144
bq.  
bq.   Do you want to move this into setup(), as it is common in both 
testcases?

Done


bq.  On 2011-04-05 07:52:15, Amareshwari Sriramadasu wrote:
bq.   
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java,
 lines 192-209
bq.   https://reviews.apache.org/r/528/diff/2/?file=14850#file14850line192
bq.  
bq.   code looks duplicated. Can it be refactored by passing group names 
to a method?

Done


- Devaraj


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review386
---


On 2011-03-29 10:26:38, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/528/
bq.  ---
bq.  
bq.  (Updated 2011-03-29 10:26:38)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fixes to some security issues discussed in HIVE-1988
bq.  
bq.  
bq.  This addresses bug HIVE-1988.
bq.  https://issues.apache.org/jira/browse/HIVE-1988
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1085623 
bq.
http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java
 1085623 
bq.  
bq.  Diff: https://reviews.apache.org/r/528/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  New unit test added and that passes. All unit tests passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



 Make the delegation token issued by the MetaStore owned by the right user
 -

 Key: HIVE-1988
 URL: https://issues.apache.org/jira/browse/HIVE-1988
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.8.0

 Attachments: hive-1988-3.patch, hive-1988.patch


 The 'owner' of any delegation token issued by the MetaStore is set to the 
 requesting user. When a delegation token is requested by the user himself during 
 a job submission, this is fine. However, when the token is 
 requested by a service (e.g., Oozie) on behalf of the user, the token's 
 owner is set to the user the service is running as. Later on, when the token 
 is used by a MapReduce task, the MetaStore treats the incoming request as 
 coming from Oozie and does operations as Oozie. This means any new directory 
 creations (e.g., create_table) on HDFS by the MetaStore will end up with 
 Oozie as the owner.
 Also, the MetaStore doesn't check whether a user asking for a token on behalf 
 of some other user is actually authorized to 

[jira] [Updated] (HIVE-867) Add add UDFs found in mysql

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-867:


Component/s: UDF

 Add add UDFs found in mysql
 ---

 Key: HIVE-867
 URL: https://issues.apache.org/jira/browse/HIVE-867
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Attachments: hive-867-1.diff, hive-867-10.diff, hive-867-2.diff, 
 hive-867-3.diff, hive-867-7.diff


 Some UDFs that MySQL has and Hive does not:
 atan
 aes_decrypt
 aes_encrypt
 bit_and
 bit_count
 bit_length
 bit_or
 bit_xor
 char_length
 char
 character_length
 collation
 compress
 crc32
 encode
 encrypt
 format
 greatest
 in
 inet_aton
 inet_ntoa
 match
 md5
 oct
 ord
 pi
 radians
 sha1 _sha
 sign
 sleep
 truncate



[jira] [Resolved] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-2061.
--

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]

Committed to trunk.

 Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward 
 compatibility
 --

 Key: HIVE-2061
 URL: https://issues.apache.org/jira/browse/HIVE-2061
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Ning Zhang
Assignee: Ning Zhang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2061.patch


 We have seen a use case where the user's script runs 'add jar 
 hive_contrib.jar'. Since Hive has renamed the jar file to 
 hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask 
 the user to change the script, then when Hive upgrades its version again the user 
 will need to change the script again. Creating a symlink seems to be the best 
 solution. 
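
The idea can be sketched as follows, assuming the versioned jar and the link live in the same directory. The class ContribSymlinkSketch and the method linkContribJar are hypothetical names for this sketch; the committed change may well create the link in the build scripts instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper showing a stable jar name that points at the versioned jar.
class ContribSymlinkSketch {
    /**
     * Creates libDir/hive_contrib.jar as a relative symlink to
     * hive-contrib-{version}.jar in the same directory.
     */
    static Path linkContribJar(Path libDir, String version) throws IOException {
        Path target = libDir.resolve("hive-contrib-" + version + ".jar");
        if (!Files.exists(target)) {
            throw new IOException("Missing jar: " + target);
        }
        Path link = libDir.resolve("hive_contrib.jar");
        // Remove a link left behind by an earlier version (deletes the link, not its target).
        Files.deleteIfExists(link);
        // A relative target keeps the link valid if the whole directory is moved.
        return Files.createSymbolicLink(link, target.getFileName());
    }
}
```

A script that runs 'add jar hive_contrib.jar' then keeps working across upgrades, because only the link target changes.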



[jira] [Updated] (HIVE-2061) Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward compatibility

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2061:
-

Component/s: Build Infrastructure

 Create a hive_contrib.jar symlink to hive-contrib-{version}.jar for backward 
 compatibility
 --

 Key: HIVE-2061
 URL: https://issues.apache.org/jira/browse/HIVE-2061
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Reporter: Ning Zhang
Assignee: Ning Zhang
Priority: Minor
 Fix For: 0.8.0

 Attachments: HIVE-2061.patch


 We have seen a use case where the user's script runs 'add jar 
 hive_contrib.jar'. Since Hive has renamed the jar file to 
 hive-contrib-{version}.jar, this introduced a backward incompatibility. If we ask 
 the user to change the script, then when Hive upgrades its version again the user 
 will need to change the script again. Creating a symlink seems to be the best 
 solution. 



[jira] [Updated] (HIVE-1985) better error message for selecting non-existing columns

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1985:
-

Component/s: Query Processor
 Diagnosability

 better error message for selecting non-existing columns
 ---

 Key: HIVE-1985
 URL: https://issues.apache.org/jira/browse/HIVE-1985
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, Query Processor
Reporter: He Yongqiang

 There should be a better error message for a query like:
 select a.key,a,a.value from src a;



[jira] [Updated] (HIVE-2057) eliminate parser warning for Identifier DOT Identifier

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2057:
-

Component/s: Diagnosability

 eliminate parser warning for Identifier DOT Identifier
 

 Key: HIVE-2057
 URL: https://issues.apache.org/jira/browse/HIVE-2057
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, Query Processor
Reporter: John Sichi

 I noticed this warning in recent builds:
 {noformat}
 build-grammar:
  [echo] Building Grammar 
 /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
   
  [java] ANTLR Parser Generator  Version 3.0.1 (August 13, 2007)  1989-2007
  [java] warning(200): 
 /data/users/jsichi/open/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g:1503:5:
  Decision can match input such as Identifier DOT Identifier using multiple 
 alternatives: 1, 2
  [java] As a result, alternative(s) 2 were disabled for that input
 {noformat}
 This was introduced by HIVE-1517.  Is there a way to get rid of it?



[jira] [Updated] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1935:
-

Component/s: Security

 set hive.security.authorization.createtable.owner.grants to null by default
 ---

 Key: HIVE-1935
 URL: https://issues.apache.org/jira/browse/HIVE-1935
 Project: Hive
  Issue Type: Bug
  Components: Security
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.7.0

 Attachments: HIVE-1935.1.patch


 It seems an empty setting in hive-site.xml does not override hive-default.xml



[jira] [Resolved] (HIVE-1935) set hive.security.authorization.createtable.owner.grants to null by default

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-1935.
--

   Resolution: Fixed
Fix Version/s: 0.7.0
 Hadoop Flags: [Reviewed]

 set hive.security.authorization.createtable.owner.grants to null by default
 ---

 Key: HIVE-1935
 URL: https://issues.apache.org/jira/browse/HIVE-1935
 Project: Hive
  Issue Type: Bug
  Components: Security
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.7.0

 Attachments: HIVE-1935.1.patch


 It seems an empty setting in hive-site.xml does not override hive-default.xml



[jira] [Updated] (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1841:
-

Component/s: Metastore

  datanucleus.fixedDatastore should be true in hive-default.xml
 --

 Key: HIVE-1841
 URL: https://issues.apache.org/jira/browse/HIVE-1841
 Project: Hive
  Issue Type: Improvement
  Components: Configuration, Metastore
Affects Versions: 0.6.0
Reporter: Edward Capriolo
Priority: Minor
 Attachments: HIVE-1841.1.patch.txt


 Two datanucleus variables:
 {noformat}
 <property>
   <name>datanucleus.autoCreateSchema</name>
   <value>false</value>
 </property>
 <property>
   <name>datanucleus.fixedDatastore</name>
   <value>true</value>
 </property>
 {noformat}
 are dangerous.  We do want the schema to auto-create itself, but we do not 
 want the schema to auto-update itself. 
 Someone might accidentally point a trunk build at the wrong metastore and 
 unknowingly update it. I believe we should set this to false and possibly trap 
 exceptions stemming from Hive wanting to do any update. This way someone has 
 to actively acknowledge the update, either by setting this to true and then 
 starting up Hive, or by leaving it false, removing schema-modification 
 privileges from the user that Hive uses, and doing the updates by hand. 



[jira] [Updated] (HIVE-1825) Different defaults for hive.metastore.local

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1825:
-

Component/s: Metastore

 Different defaults for hive.metastore.local
 ---

 Key: HIVE-1825
 URL: https://issues.apache.org/jira/browse/HIVE-1825
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Metastore
Affects Versions: 0.6.0
Reporter: Lars Francke

 hive-default.xml sets {{hive.metastore.local}} to {{true}}. In the code, 
 however, there is this:
 {code:title=HiveMetaStoreClient.java}
 boolean localMetaStore = conf.getBoolean("hive.metastore.local", false);
 {code}
 This leads to different behaviour depending on whether hive-default.xml is 
 on the classpath or not, which can lead to some confusion ;-)
 I can supply a patch - should be pretty similar. I just don't know what the 
 real default should be. My guess would be {{true}}.
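
The mismatch described above can be reproduced in isolation. The following is only a minimal sketch: java.util.Properties is used as a stand-in for the Hadoop configuration object (an assumption made to keep the example self-contained), and the class and method names are hypothetical.

```java
import java.util.Properties;

// Hypothetical stand-in: java.util.Properties plays the role of the Hadoop
// configuration, and "hive-default.xml on the classpath" is simulated by
// setting the key explicitly.
class MetastoreLocalDefault {
    static final String KEY = "hive.metastore.local";

    // Mirrors the quoted call: the in-code fallback value is false.
    static boolean isLocal(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "false"));
    }

    public static void main(String[] args) {
        Properties withoutDefaults = new Properties();  // defaults file not loaded
        Properties withDefaults = new Properties();
        withDefaults.setProperty(KEY, "true");          // value the defaults file supplies

        System.out.println(isLocal(withoutDefaults));   // falls back to the in-code default
        System.out.println(isLocal(withDefaults));      // uses the value from the file
    }
}
```

The two calls disagree, which is the confusion the report describes; making the in-code fallback match the value in the defaults file would remove the difference.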

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1875) On job failure log some messages explaining that Hive is retrieving task completion events

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1875:
-

Component/s: Diagnosability

 On job failure log some messages explaining that Hive is retrieving task 
 completion events
 --

 Key: HIVE-1875
 URL: https://issues.apache.org/jira/browse/HIVE-1875
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, Query Processor
Reporter: Carl Steinbach

 If a job fails, Hive currently displays a link to the task with the most 
 failures for easy access to the error logs. However, generating the 
 link may require many RPCs to get all the task completion events, adding a 
 delay of up to 30 minutes. HIVE-1578 added a configuration property that 
 allows the user to disable this behavior.
 This ticket covers adding some logging statements notifying the user that 
 Hive is retrieving this information. This is intended to avoid giving the user 
 the impression that the CLI has simply locked up.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1095) Hive in Maven

2011-04-05 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016146#comment-13016146
 ] 

Ning Zhang commented on HIVE-1095:
--

I tried the first command:  ant make-maven -Dversion=0.8.0-SNAPSHOT -logfile 
make-maven.log
and it seems to have succeeded. I'll attach make-maven.log. It would be nice if 
someone who has the knowledge could take a look and see if it is correct. 

I haven't run the other command to publish to Maven yet. I can run that as 
long as I get a +1 from committers who have the knowledge. 

 Hive in Maven
 -

 Key: HIVE-1095
 URL: https://issues.apache.org/jira/browse/HIVE-1095
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
 Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, 
 HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, 
 hiveReleasedToMaven.tar.gz


 Getting hive into maven main repositories
 Documentation on how to do this is on:
 http://maven.apache.org/guides/mini/guide-central-repository-upload.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1095) Hive in Maven

2011-04-05 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1095:
-

Attachment: make-maven.log

 Hive in Maven
 -

 Key: HIVE-1095
 URL: https://issues.apache.org/jira/browse/HIVE-1095
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
 Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, 
 HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, 
 hiveReleasedToMaven.tar.gz, make-maven.log


 Getting hive into maven main repositories
 Documentation on how to do this is on:
 http://maven.apache.org/guides/mini/guide-central-repository-upload.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1301) RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add options

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1301:
-

Component/s: UDF

 RAND() should be RAND_UNIF(); also, we should create RAND_NORM() and add 
 options
 

 Key: HIVE-1301
 URL: https://issues.apache.org/jira/browse/HIVE-1301
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Adam Kramer
Assignee: Paul Yang

 The generation of pseudorandom data is very useful, but would be even MORE 
 useful if we had a few levers to pull.
 Currently, RAND() generates a random number pulled from a uniform 
 distribution between 0 and 1. It would be great if the user could specify the 
 min and max because that is a more elegant way to write code: RAND()*200+50 
 will generate the same thing as RAND_UNIF(min=50,max=250) but the latter is a 
 much better way to express this in a readable manner.
 Similarly, it would be useful to have non-uniform random data for statistical 
 purposes, e.g. RAND_NORM(mean=0,sd=1).
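
As a rough illustration of the two requested functions, here is a minimal sketch built on java.util.Random. The names randUnif and randNorm and their parameter order are assumptions for illustration only, not an existing Hive API.

```java
import java.util.Random;

// Hypothetical helpers sketching the requested RAND_UNIF and RAND_NORM.
class RandSketch {
    // Uniform sample between min and max; equivalent to RAND()*(max-min)+min.
    static double randUnif(Random rng, double min, double max) {
        return min + rng.nextDouble() * (max - min);
    }

    // Normal sample with the given mean and standard deviation.
    static double randNorm(Random rng, double mean, double sd) {
        return mean + rng.nextGaussian() * sd;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        System.out.println(randUnif(rng, 50, 250)); // same range as RAND()*200+50
        System.out.println(randNorm(rng, 0, 1));    // standard normal
    }
}
```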

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1262) Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1262:
-

Component/s: UDF

 Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
 -

 Key: HIVE-1262
 URL: https://issues.apache.org/jira/browse/HIVE-1262
 Project: Hive
  Issue Type: New Feature
  Components: UDF
Affects Versions: 0.6.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Attachments: hive-1262-1.patch.txt


 Add security/checksum UDFs sha,crc32,md5,aes_encrypt, and aes_decrypt
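
For orientation, the digest and checksum functions named above map onto standard JDK classes. The sketch below shows only that mapping; the class and method names are hypothetical, it is not the attached patch, and the AES functions are left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical helpers; a real UDF would wrap logic like this.
class ChecksumSketch {
    // Hex-encoded digest for a JDK algorithm name such as "MD5" or "SHA-1".
    static String hexDigest(String algorithm, String input) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance(algorithm)
                .digest(input.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // CRC-32 of the UTF-8 bytes of the input, as an unsigned value in a long.
    static long crc32(String input) {
        CRC32 crc = new CRC32();
        crc.update(input.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hexDigest("MD5", "hive"));
        System.out.println(hexDigest("SHA-1", "hive"));
        System.out.println(crc32("hive"));
    }
}
```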

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1360) Allow UDFs to access constant parameter values at compile time

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1360:
-

Component/s: UDF

 Allow UDFs to access constant parameter values at compile time
 --

 Key: HIVE-1360
 URL: https://issues.apache.org/jira/browse/HIVE-1360
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor, UDF
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach

 UDFs should be able to access constant parameter values at compile time.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1384) HiveServer should run as the user who submitted the query.

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1384:
-

Component/s: Security

 HiveServer should run as the user who submitted the query.
 --

 Key: HIVE-1384
 URL: https://issues.apache.org/jira/browse/HIVE-1384
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, Security, Server Infrastructure
Reporter: He Yongqiang
Assignee: He Yongqiang



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1343) add an interface in RCFile to support concatenation of two files without (de)compression

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1343:
-

Component/s: Serializers/Deserializers

 add an interface in RCFile to support concatenation of two files without 
 (de)compression
 

 Key: HIVE-1343
 URL: https://issues.apache.org/jira/browse/HIVE-1343
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Affects Versions: 0.6.0
Reporter: Ning Zhang
Assignee: He Yongqiang
 Attachments: HIVE-1343.1.patch


 If two files are concatenated, we need to read each record in these files and 
 write them back to the destination file. The IO cost is mostly unavoidable 
 due to the lack of append functionality in HDFS. However the CPU cost could 
 be significantly reduced by avoiding compression and decompression of the 
 files.
 The File Format layer should provide an API that implements block-level 
 concatenation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1480) CREATE TABLE IF NOT EXISTS get incorrect table name

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1480:
-

Component/s: Query Processor

 CREATE TABLE IF NOT EXISTS get incorrect table name
 ---

 Key: HIVE-1480
 URL: https://issues.apache.org/jira/browse/HIVE-1480
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Ning Zhang
Assignee: Ning Zhang

 CREATE TABLE IF NOT EXISTS T AS SELECT ... gives the following error after 
 the job succeeded:
 Setting total progress to 100
 10/07/22 11:26:14 INFO exec.ExecDriver: Ended Job = job_201006221843_688872
 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001
  to: 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate
 10/07/22 11:26:14 INFO exec.FileSinkOperator: Moving tmp dir: 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/_tmp.10001.intermediate
  to: 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001
 Moving data to: 
 hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43
 10/07/22 11:26:15 INFO exec.MoveTask: Moving data to: 
 hdfs://dfstmp.data.facebook.com:9000/user/facebook/warehouse/ericm_budget_email_actua43
  from 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/10001
 10/07/22 11:26:15 WARN hdfs.DFSClient: File 
 /user/facebook/warehouse/ericm_budget_email_actua43 is beng deleted only 
 through Trash org.apache.hadoop.fs.FsShell.delete because all deletes must go 
 through Trash.
 10/07/22 11:26:15 INFO hive.log: DDL: struct ericm_budget_email_actua43 { 
 string acct_id, string first_name, string email, string campaign_name_list}
 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: create_table: db=default 
 tbl=ericm_budget_email_actua43
 10/07/22 11:26:15 INFO metastore.HiveMetaStore: 0: get_table : db=default 
 tbl=ericm_budget_email_actua43
 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com 
 database:audit_silver
 10/07/22 11:26:15 INFO hooks.HookUtils: Host:cdb067.snc1.facebook.com 
 database:lineage_silver
 10/07/22 11:26:15 INFO hooks.HookUtils: rows inserted: 1 sql: insert into 
 snc1_command_log set command = ?, command_type = ?, inputs = ?, outputs = ?, 
 queryId = ?, user_info = ?
 OK
 10/07/22 11:26:15 INFO ql.Driver: OK
 10/07/22 11:26:16 INFO ql.Context: getStream error: 
 java.io.FileNotFoundException: File does not exist: 
 hdfs://dfstmp.data.facebook.com:9000/tmp/hive-root/hive_2010-07-22_11-20-15_027_2717837693750284928/1
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457)
 at 
 org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:294)
 at org.apache.hadoop.hive.ql.Context.getStream(Context.java:386)
 at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:688)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:146)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:294)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
  
 Time taken: 361.26 seconds
 10/07/22 11:26:16 INFO CliDriver: Time taken: 361.26 seconds
 Exit code: 0, 0
 dus: Cannot access /user/facebook/warehouse/IF: No such file or directory.
 tablesize cmd:/mnt/vol/hive/sites/silver.trunk/hadoop/bin/hadoop dfs -dus 
 /user/facebook/warehouse/IF | cut -d$'\t' -f2
  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1625) Added implementation to HivePreparedStatement, HiveBaseResultSet and HiveQueryResultSet.

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1625:
-

Component/s: JDBC

 Added implementation to HivePreparedStatement, HiveBaseResultSet and 
 HiveQueryResultSet.
 

 Key: HIVE-1625
 URL: https://issues.apache.org/jira/browse/HIVE-1625
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Sean Flatley
Assignee: Sean Flatley
 Attachments: HIVE-1625.patch, changelog.txt, testJdbcDriver.log


 We implemented several of the HivePreparedStatement set methods, such as 
 setString(int, String) and the means to substitute place holders in the SQL 
 with the values set.  
 HiveQueryResultSet and HiveBaseResultSet were enhanced so that getStatement() 
 could be implemented.
 See attached change log for details.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1665) drop operations may cause file leak

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1665?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1665:
-

Component/s: Metastore

 drop operations may cause file leak
 ---

 Key: HIVE-1665
 URL: https://issues.apache.org/jira/browse/HIVE-1665
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Reporter: He Yongqiang
Assignee: He Yongqiang
 Attachments: hive-1665.1.patch


 Right now when doing a drop, Hive first drops metadata and then drops the 
 actual files. If the file system is down at that time, the files will never be 
 deleted. 
 Had an offline discussion about this:
 to fix this, add a new conf scratch dir into hive conf. 
 when doing a drop operation:
 1) move data to scratch directory
 2) drop metadata
 3.1) if 2) failed, roll back 1) and report error
 3.2) if 2) succeeded, drop data from scratch directory
 4) if 3.2 fails, we are ok because we assume the scratch dir will be emptied 
 manually.
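
The steps above can be sketched as follows. This is a minimal illustration on the local file system via java.nio.file, not the attached patch; the MetadataDropper interface and all other names are hypothetical, and a real implementation would go through the Hadoop FileSystem API instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ScratchDrop {
    // Hypothetical hook standing in for the metastore call.
    interface MetadataDropper { void drop() throws Exception; }

    static void dropTable(Path data, Path scratchDir, MetadataDropper metadata) throws IOException {
        Path parked = scratchDir.resolve(data.getFileName());
        Files.move(data, parked);                    // 1) move data to scratch directory
        try {
            metadata.drop();                         // 2) drop metadata
        } catch (Exception e) {
            Files.move(parked, data);                // 3.1) roll back 1) and report error
            throw new IOException("metadata drop failed; data restored", e);
        }
        try {
            deleteRecursively(parked);               // 3.2) drop data from scratch directory
        } catch (IOException e) {
            // 4) acceptable: the scratch dir is assumed to be emptied manually
        }
    }

    // Deletes children before parents by visiting paths in reverse order.
    static void deleteRecursively(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws Exception {
        Path base = Files.createTempDirectory("drop-demo");
        Path table = Files.createDirectory(base.resolve("t1"));
        Files.createFile(table.resolve("part-0"));
        Path scratch = Files.createDirectory(base.resolve("scratch"));
        dropTable(table, scratch, () -> { });        // metadata drop succeeds
        System.out.println(Files.exists(table));
    }
}
```

Because the data is moved before the metadata is touched, a failure in step 2 leaves the table fully restorable, and a failure in step 3.2 leaves only orphaned files in a directory that is expected to be cleaned anyway.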

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1666) retry metadata operation in case of an failure

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1666:
-

Component/s: Metastore

 retry metadata operation in case of an failure
 --

 Key: HIVE-1666
 URL: https://issues.apache.org/jira/browse/HIVE-1666
 Project: Hive
  Issue Type: Improvement
  Components: Metastore, Query Processor
Reporter: Namit Jain
Assignee: Paul Yang

 If a user is trying to insert into a partition,
 insert overwrite table T partition (p) select ..
 it is possible that the directory gets created, but the metadata creation of 
 T@p fails - 
 currently, we will just throw an error. The final directory has been created.
 It would be useful to at least retry the metadata operation. 
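
A bounded retry around the metadata call could look roughly like the sketch below. The helper name, the attempt limit, and the fixed backoff are all assumptions for illustration; the ticket itself does not specify a retry policy.

```java
import java.util.concurrent.Callable;

// Hypothetical retry helper; the operation passed in stands for the
// metastore call that registers the partition.
class MetadataRetry {
    static <T> T withRetries(Callable<T> op, int maxAttempts, long backoffMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;                        // remember the failure
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis); // wait before the next attempt
                }
            }
        }
        throw last;                              // all attempts failed
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient metastore error");
            }
            return "partition registered";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```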

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1667) Store the group of the owner of the table in metastore

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1667:
-

Component/s: Security

 Store the group of the owner of the table in metastore
 --

 Key: HIVE-1667
 URL: https://issues.apache.org/jira/browse/HIVE-1667
 Project: Hive
  Issue Type: New Feature
  Components: Security
Reporter: Namit Jain
 Attachments: hive-1667.patch


 Currently, the group of the owner of the table is not stored in the metastore.
 Secondly, if you create a table, the table's owner group is set to the group 
 for the parent. It is not read from the UGI passed in.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1690) HivePreparedStatement.executeImmediate(String sql) is breaking the exception stack

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1690:
-

Component/s: JDBC

 HivePreparedStatement.executeImmediate(String sql) is breaking the exception 
 stack
 --

 Key: HIVE-1690
 URL: https://issues.apache.org/jira/browse/HIVE-1690
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Reporter: Eli Griv
Priority: Minor

 in HivePreparedStatement.executeImmediate(String sql), the exception stack is 
 broken, so it's impossible to know which method threw "Method not 
 supported" 
 FIX :
 HivePreparedStatement.java
 L166
 -   throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode());
 +   throw new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode(), e);
 L168
 -   throw new SQLException(ex.toString(), "08S01");
 +   throw new SQLException(ex.toString(), "08S01", ex);
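
The effect of the proposed change can be seen in a small standalone sketch. The class and method names are hypothetical; only the java.sql.SQLException constructors are taken from the JDK.

```java
import java.sql.SQLException;

// Hypothetical demo of rethrowing with and without the original cause.
class CauseChaining {
    // Current behaviour: message, SQL state and error code are copied, the cause is lost.
    static SQLException withoutCause(SQLException e) {
        return new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode());
    }

    // Proposed behaviour: the original exception is attached as the cause.
    static SQLException withCause(SQLException e) {
        return new SQLException(e.getMessage(), e.getSQLState(), e.getErrorCode(), e);
    }

    public static void main(String[] args) {
        SQLException original = new SQLException("Method not supported");
        System.out.println(withoutCause(original).getCause());          // null: stack is broken
        System.out.println(withCause(original).getCause() == original); // true: stack is preserved
    }
}
```

With the cause attached, the stack trace printed by the caller includes a "Caused by" section pointing at the method that originally threw.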

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1071) Making RCFile concatenatable to reduce the number of files of the output

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1071:
-

Component/s: Serializers/Deserializers

 Making RCFile concatenatable to reduce the number of files of the output
 --

 Key: HIVE-1071
 URL: https://issues.apache.org/jira/browse/HIVE-1071
 Project: Hive
  Issue Type: Improvement
  Components: Serializers/Deserializers
Reporter: Zheng Shao

 Hive automatically determines the number of reducers most of the time.
 Sometimes, we create a lot of small files.
 Hive has an option to merge those small files through a map-reduce job.
 Dhruba has an idea which can fix it even faster:
 if we can make RCFile concatenatable, then we can simply tell the namenode to 
 merge these files.
 Pros: This approach does not do any I/O so it's faster.
 Cons: We have to zero-fill the files to make sure they can be concatenated 
 (all blocks except the last have to be full HDFS blocks).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1189) Add package-info.java to Hive

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1189:
-

Component/s: Diagnosability

 Add package-info.java to Hive
 -

 Key: HIVE-1189
 URL: https://issues.apache.org/jira/browse/HIVE-1189
 Project: Hive
  Issue Type: New Feature
  Components: Build Infrastructure, Diagnosability
Affects Versions: 0.6.0
Reporter: Zheng Shao
Assignee: Zheng Shao
 Attachments: HIVE-1189.1.patch


 Hadoop automatically generates build/src/org/apache/hadoop/package-info.java 
 with information like this:
 {code}
 /*
  * Generated by src/saveVersion.sh
  */
 @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568",
  user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", 
 url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20")
 package org.apache.hadoop;
 {code}
 Hive should do the same thing so that we can easily know the version of the 
 code at runtime.
 This will help us identify whether we are still running the same version of 
 Hive, if we serialize the plan and later continue the execution (See 
 HIVE-1100).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1189) Add package-info.java to Hive

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1189:
-

Component/s: Build Infrastructure

 Add package-info.java to Hive
 -

 Key: HIVE-1189
 URL: https://issues.apache.org/jira/browse/HIVE-1189
 Project: Hive
  Issue Type: New Feature
  Components: Build Infrastructure, Diagnosability
Affects Versions: 0.6.0
Reporter: Zheng Shao
Assignee: Zheng Shao
 Attachments: HIVE-1189.1.patch


 Hadoop automatically generates build/src/org/apache/hadoop/package-info.java 
 with information like this:
 {code}
 /*
  * Generated by src/saveVersion.sh
  */
 @HadoopVersionAnnotation(version="0.20.2-dev", revision="826568",
  user="zshao", date="Sun Oct 18 17:46:56 PDT 2009", 
 url="http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20")
 package org.apache.hadoop;
 {code}
 Hive should do the same thing so that we can easily know the version of the 
 code at runtime.
 This will help us identify whether we are still running the same version of 
 Hive, if we serialize the plan and later continue the execution (See 
 HIVE-1100).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-613) Hive server fetch row incorrect NULL representation

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-613:


Component/s: Server Infrastructure

 Hive server fetch row incorrect NULL representation
 ---

 Key: HIVE-613
 URL: https://issues.apache.org/jira/browse/HIVE-613
 Project: Hive
  Issue Type: Bug
  Components: Server Infrastructure
Reporter: Eric Hwang
Priority: Minor

 The Hive server fetch function does not correctly serialize null fields in 
 the returned rows. Regardless of the actual null format representation within 
 the table, the Hive server fetch function will always return null fields as 
 "NULL", creating a potential conflict with the actual string "NULL".

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-627) Optimizer should only access RowSchema (and not RowResolver)

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-627:


Component/s: Query Processor

 Optimizer should only access RowSchema (and not RowResolver)
 

 Key: HIVE-627
 URL: https://issues.apache.org/jira/browse/HIVE-627
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Zheng Shao

 The column pruner is accessing RowResolver a lot of times, for things like 
 reverseLookup, and get(alias, column).
 These are not necessary - we should not need to translate an internal name to 
 (alias, column) and then translate back. We should be able to use internal 
 name from one operator to the other, using RowSchema.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-660) Fix UDFLike for multi-line inputs

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-660:


Component/s: UDF

 Fix UDFLike for multi-line inputs
 -

 Key: HIVE-660
 URL: https://issues.apache.org/jira/browse/HIVE-660
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Zheng Shao
Assignee: Zheng Shao
 Attachments: HIVE-660.1.patch


 We should use the DOTALL option in UDFLike, because '%' and '_' should also 
 match the newline.
 See 
 http://java.sun.com/j2se/1.4.2/docs/api/java/util/regex/Pattern.html#DOTALL
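
The difference DOTALL makes can be shown with a small sketch. The LIKE-to-regex translation below is a simplified assumption (no escape-character handling) and is not the actual UDFLike code.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified LIKE matcher used only to show the DOTALL effect.
class LikeDotAll {
    // '%' becomes ".*", '_' becomes ".", everything else is matched literally.
    static String likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (char c : like.toCharArray()) {
            if (c == '%') {
                sb.append(".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    static boolean like(String input, String pattern, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile(likeToRegex(pattern), flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        String multiLine = "first\nsecond";
        System.out.println(like(multiLine, "first%second", false)); // false: '.' stops at the newline
        System.out.println(like(multiLine, "first%second", true));  // true: DOTALL lets '.' match it
    }
}
```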

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-664) optimize UDF split

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-664:


Component/s: (was: Query Processor)
 UDF
 Labels: optimization  (was: )

 optimize UDF split
 --

 Key: HIVE-664
 URL: https://issues.apache.org/jira/browse/HIVE-664
 Project: Hive
  Issue Type: Bug
  Components: UDF
Reporter: Namit Jain
  Labels: optimization

 Min Zhou added a comment - 21/Jul/09 07:34 AM
 It's very useful for us.
 Some comments:
1. Can you implement it directly with Text? Avoiding string decoding and 
 encoding would be faster. Of course that trick may lead to another problem, 
 as String.split uses a regular expression for splitting.
2. getDisplayString() always returns a string in lowercase.
 Namit Jain added a comment - 21/Jul/09 09:22 AM
 Committed. Thanks Emil
 Emil Ibrishimov added a comment - 21/Jul/09 10:48 AM
 There are some easy (compromise) ways to optimize split:
 1. Check if the regex argument actually contains some regex specific 
 characters and if it doesn't, do a straightforward split without converting 
 to strings.
 2. Assume some default value for the second argument (for example, 
 split(str) to be equivalent to split(str, ' ')) and optimize for this value
 3. Have two separate split functions - one that does regex and one that 
 splits around plain text.
 I think that 1 is a good choice and can be done rather quickly.
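
Option 1 could be sketched as below. For self-containment the example works on java.lang.String rather than Hadoop's Text, so it shows only the dispatch between the literal path and the regex path, not the byte-level saving; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: take a plain indexOf-based path when the delimiter
// contains no regex metacharacters, otherwise fall back to String.split.
class FastSplit {
    private static final String REGEX_META = "\\^$.|?*+()[]{}";

    static boolean isPlainLiteral(String delimiter) {
        if (delimiter.isEmpty()) {
            return false;
        }
        for (int i = 0; i < delimiter.length(); i++) {
            if (REGEX_META.indexOf(delimiter.charAt(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    static List<String> split(String input, String delimiter) {
        if (!isPlainLiteral(delimiter)) {
            // Regex path; limit -1 keeps trailing empty strings.
            return Arrays.asList(input.split(delimiter, -1));
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int idx;
        while ((idx = input.indexOf(delimiter, start)) >= 0) {
            parts.add(input.substring(start, idx));
            start = idx + delimiter.length();
        }
        parts.add(input.substring(start));
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("a,b,,c", ","));      // literal path
        System.out.println(split("a1b22c", "[0-9]+")); // regex path
    }
}
```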

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-538) make hive_jdbc.jar self-containing

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-538:


Component/s: (was: Clients)
 JDBC

 make hive_jdbc.jar self-containing
 --

 Key: HIVE-538
 URL: https://issues.apache.org/jira/browse/HIVE-538
 Project: Hive
  Issue Type: Improvement
  Components: JDBC
Affects Versions: 0.3.0, 0.4.0, 0.6.0
Reporter: Raghotham Murthy

 Currently, most jars in hive/build/dist/lib and the hadoop-*-core.jar are 
 required in the classpath to run jdbc applications on hive. We need to do 
 at least the following to get rid of most unnecessary dependencies:
 1. get rid of dynamic serde and use a standard serialization format, maybe 
 tab separated, json or avro
 2. don't use hadoop configuration parameters
 3. repackage thrift and fb303 classes into hive_jdbc.jar

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HIVE-663) column aliases should be supported

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-663.
-

  Resolution: Duplicate
Hadoop Flags: [Reviewed]

This was fixed a long time ago in some other ticket.

 column aliases should be supported
 --

 Key: HIVE-663
 URL: https://issues.apache.org/jira/browse/HIVE-663
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain

 select key as x from src where x  10;
 should work

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-535) Memory-efficient hash-based Aggregation

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-535:


Component/s: Query Processor
 Labels: optimization  (was: )

 Memory-efficient hash-based Aggregation
 ---

 Key: HIVE-535
 URL: https://issues.apache.org/jira/browse/HIVE-535
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.4.0
Reporter: Zheng Shao
  Labels: optimization

 Currently there is a lot of memory overhead in the hash-based aggregation in 
 GroupByOperator.
 The net result is that GroupByOperator won't be able to store many entries in 
 its HashTable, and flushes frequently, and won't be able to achieve very good 
 partial aggregation result.
 Here are some initial thoughts (some of them are from Joydeep a long time ago):
 A1. Serialize the key of the HashTable. This will eliminate the 16-byte 
 per-object overhead of Java in keys (depending on how many objects there are 
 in the key, the saving can be substantial).
 A2. Use more memory-efficient hash tables - java.util.HashMap has about 64 
 bytes of overhead per entry.
 A3. Use primitive array to store aggregation results. Basically, the UDAF 
 should manage the array of aggregation results, so UDAFCount should manage a 
 long[], UDAFAvg should manage a double[] and a long[]. The external code 
 should pass an index to iterate/merge/terminate an aggregation result. This 
 will eliminate the 16-byte per-object overhead of Java.
 More ideas are welcome.
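
Idea A3 can be illustrated with a minimal sketch of an average aggregation that keeps its state in primitive arrays and is addressed by slot index. The class name, method names, and growth policy are assumptions for illustration, not an existing Hive interface.

```java
import java.util.Arrays;

// Hypothetical array-backed AVG state: one double[] for sums, one long[] for
// counts, so no per-group Java object is allocated.
class PrimitiveAvgBuffers {
    private double[] sums = new double[4];
    private long[] counts = new long[4];
    private int size = 0;

    // Reserves a slot for a new group and returns its index.
    int newSlot() {
        if (size == sums.length) {
            sums = Arrays.copyOf(sums, size * 2);     // new entries are zero-filled
            counts = Arrays.copyOf(counts, size * 2);
        }
        return size++;
    }

    // Adds one input value to the group stored at the given slot.
    void iterate(int slot, double value) {
        sums[slot] += value;
        counts[slot]++;
    }

    // Folds a partial aggregation result into the given slot.
    void merge(int slot, double partialSum, long partialCount) {
        sums[slot] += partialSum;
        counts[slot] += partialCount;
    }

    // Returns the final average, or null when the group saw no rows.
    Double terminate(int slot) {
        return counts[slot] == 0 ? null : sums[slot] / counts[slot];
    }

    public static void main(String[] args) {
        PrimitiveAvgBuffers avg = new PrimitiveAvgBuffers();
        int slot = avg.newSlot();
        avg.iterate(slot, 1.0);
        avg.iterate(slot, 3.0);
        System.out.println(avg.terminate(slot));
    }
}
```

The hash table would then map a group key to an int slot rather than to an aggregation object, which is where the per-object overhead is saved.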

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-508) Better error message for UDF parameter handling

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-508:


Component/s: UDF
 Diagnosability

 Better error message for UDF parameter handling
 ---

 Key: HIVE-508
 URL: https://issues.apache.org/jira/browse/HIVE-508
 Project: Hive
  Issue Type: Bug
  Components: Diagnosability, UDF
Reporter: Zheng Shao

 {code}
 CREATE TABLE x (a map<string,string>);
 SELECT round(a) FROM x;
 {code}
 This will show an error message:
 FAILED: Unknown exception : 
 org.apache.hadoop.hive.serde2.typeinfo.MapTypeInfo cannot be cast to 
 org.apache.hadoop.hive.serde2.typeinfo.PrimitiveTypeInfo
 We need a better error message like:
 FAILED: Unable to pass a (type: map<string,string>) to function round.
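
A minimal sketch of the kind of up-front check that would produce the proposed message, written in Python for brevity. PRIMITIVE_TYPES, UdfArgumentError and check_primitive_argument are hypothetical names used only for illustration; the real fix would live in Hive's Java type-checking code, and the set of primitive types here is an assumption.

```python
# Assumed, incomplete list of primitive type names for illustration only.
PRIMITIVE_TYPES = {
    "tinyint", "smallint", "int", "bigint",
    "float", "double", "string", "boolean",
}


class UdfArgumentError(Exception):
    """Raised when a UDF receives an argument type it cannot handle."""


def check_primitive_argument(function_name, type_name):
    """Validate the argument type before any cast is attempted."""
    if type_name not in PRIMITIVE_TYPES:
        raise UdfArgumentError(
            f"Unable to pass a (type: {type_name}) to function {function_name}."
        )
    return type_name
```

Checking the type explicitly, instead of casting and letting a ClassCastException escape, is what lets the message name both the offending type and the function.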



[jira] [Updated] (HIVE-475) Lines exceeding mapred.linerecordreader.maxlength should cause exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-475:


Component/s: Diagnosability

 Lines exceeding mapred.linerecordreader.maxlength should cause exceptions
 -

 Key: HIVE-475
 URL: https://issues.apache.org/jira/browse/HIVE-475
 Project: Hive
  Issue Type: Improvement
  Components: Diagnosability, Serializers/Deserializers
Reporter: S. Alex Smith

 Currently, rows of data that exceed mapred.linerecordreader.maxlength vanish 
 silently.  Instead, an option should be added to indicate what to do under 
 this circumstance (vanish the entire line, truncate after max length, or fail 
 the job), but the default behavior should be job failure.
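
The three behaviours could be sketched as below, in Python for brevity. apply_max_length, LineTooLongError and the policy names "skip", "truncate" and "fail" are hypothetical; no such Hive or Hadoop option exists under these names as far as this issue states.

```python
class LineTooLongError(Exception):
    """Raised when a line exceeds the limit and the policy is 'fail'."""


VALID_POLICIES = ("skip", "truncate", "fail")


def apply_max_length(lines, max_length, policy="fail"):
    """Return the lines to process, handling overlong lines per `policy`.

    'fail' is the default, matching the proposal that job failure should be
    the default behaviour.
    """
    if policy not in VALID_POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    kept = []
    for line_number, line in enumerate(lines, start=1):
        if len(line) <= max_length:
            kept.append(line)
        elif policy == "skip":
            continue  # drop the entire line (the current silent behaviour)
        elif policy == "truncate":
            kept.append(line[:max_length])
        else:  # policy == "fail"
            raise LineTooLongError(
                f"line {line_number} has length {len(line)}, "
                f"which exceeds the maximum of {max_length}"
            )
    return kept
```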



[jira] [Updated] (HIVE-71) log details on rows that cause hive exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-71?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-71:
---

Component/s: Serializers/Deserializers
 Query Processor
 Diagnosability

 log details on rows that cause hive exceptions
 --

 Key: HIVE-71
 URL: https://issues.apache.org/jira/browse/HIVE-71
 Project: Hive
  Issue Type: Bug
  Components: Diagnosability, Logging, Query Processor, 
 Serializers/Deserializers
Reporter: Joydeep Sen Sarma

 Users are logging all rows in order to find the row that's causing 
 exceptions. Instead, we should log as much information as possible about the 
 row that causes an exception in the Hive stack.



[jira] [Updated] (HIVE-1362) column level statistics

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1362:
-

Component/s: Statistics

 column level statistics
 ---

 Key: HIVE-1362
 URL: https://issues.apache.org/jira/browse/HIVE-1362
 Project: Hive
  Issue Type: Sub-task
  Components: Statistics
Reporter: Ning Zhang





[jira] [Updated] (HIVE-1361) table/partition level statistics

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1361:
-

Component/s: Statistics

 table/partition level statistics
 

 Key: HIVE-1361
 URL: https://issues.apache.org/jira/browse/HIVE-1361
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor, Statistics
Reporter: Ning Zhang
 Fix For: 0.7.0

 Attachments: HIVE-1361.2.patch, HIVE-1361.2_java_only.patch, 
 HIVE-1361.3.patch, HIVE-1361.4.java_only.patch, HIVE-1361.4.patch, 
 HIVE-1361.5.java_only.patch, HIVE-1361.5.patch, HIVE-1361.java_only.patch, 
 HIVE-1361.patch, stats0.patch


 As a first step, we gather table-level stats for non-partitioned tables and 
 partition-level stats for partitioned tables. Future work could extend the 
 table-level stats to partitioned tables as well. 
 There are 3 major milestones in this subtask: 
  1) extend the insert statement to gather table/partition level stats 
 on-the-fly.
  2) extend metastore API to support storing and retrieving stats for a 
 particular table/partition. 
  3) add an ANALYZE TABLE [PARTITION] statement in Hive QL to gather stats for 
 existing tables/partitions. 
 The proposed stats are:
 Partition-level stats: 
   - number of rows
   - total size in bytes
   - number of files
   - max, min, average row sizes
   - max, min, average file sizes
 Table-level stats in addition to partition level stats:
   - number of partitions
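
As a rough illustration of the proposed stats, the sketch below computes them from in-memory data, in Python for brevity. The input shape (each file given as a list of row sizes in bytes) and the names summarize, partition_stats and table_stats are assumptions made for the example; they are not the metastore API from the patches.

```python
def summarize(sizes):
    """Max, min and average of a list of sizes (zeros when the list is empty)."""
    if not sizes:
        return {"max": 0, "min": 0, "avg": 0.0}
    return {"max": max(sizes), "min": min(sizes), "avg": sum(sizes) / len(sizes)}


def partition_stats(files):
    """Stats for one partition; `files` is a list of per-file row-size lists."""
    row_sizes = [size for rows in files for size in rows]
    file_sizes = [sum(rows) for rows in files]
    return {
        "num_rows": len(row_sizes),
        "total_size": sum(file_sizes),
        "num_files": len(files),
        "row_size": summarize(row_sizes),
        "file_size": summarize(file_sizes),
    }


def table_stats(partitions):
    """Table-level view: per-partition stats plus the number of partitions."""
    per_partition = {name: partition_stats(files) for name, files in partitions.items()}
    return {"num_partitions": len(per_partition), "partitions": per_partition}
```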



[jira] [Updated] (HIVE-1940) Query Optimization Using Column Metadata and Histograms

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1940:
-

Component/s: Statistics

 Query Optimization Using Column Metadata and Histograms
 ---

 Key: HIVE-1940
 URL: https://issues.apache.org/jira/browse/HIVE-1940
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor, Statistics
Reporter: Anja Gruenheid
 Attachments: HiveMetaStore.pdf


 The current basis for cost-based query optimization in Hive is information 
 gathered on tables and partitions. To make further improvements in query 
 optimization possible, the next step is to develop and implement 
 possibilities to gather information on columns as discussed in issue HIVE-33. 
 After that, an implementation of histograms is a possible option to use and 
 collect run-time statistics. Next to the actual implementation of these 
 features, it is also necessary to develop a consistent storage model for the 
 MetaStore.



[jira] [Updated] (HIVE-109) 'location' clause for table creation should only be allowed for external tables

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-109:


Component/s: Metastore

 'location' clause for table creation should only be allowed for external 
 tables
 ---

 Key: HIVE-109
 URL: https://issues.apache.org/jira/browse/HIVE-109
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Query Processor
Reporter: Joydeep Sen Sarma

 Currently, the code by and large does not distinguish between external and 
 internal tables. One clear distinction, though, is that storage for external 
 tables is managed outside Hive. This leads to consequences like HIVE-86, so 
 that Hive does not mess around with tables whose storage is managed 
 externally. However, we currently allow users to specify a location for 
 internal tables, which is confusing and could lead to data being deleted in 
 external folders. We should not allow this.



[jira] [Updated] (HIVE-55) restrict table and column names to be alphanumeric and _ characters

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-55?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-55:
---

Component/s: Metastore

 restrict table and column names to be alphanumeric and _ characters
 ---

 Key: HIVE-55
 URL: https://issues.apache.org/jira/browse/HIVE-55
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Query Processor
Reporter: Prasad Chakka

 Currently the DDL restricts names to alphanumeric and _ characters, but not 
 if the tables were created or altered using metastore clients directly. This 
 JIRA aims to fix that.
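
The rule itself is small; a sketch in Python for brevity is below. is_valid_name and validate_table are hypothetical helpers, and restricting to ASCII letters and digits is an assumption about what "alphanumeric" means here. The idea is that the same check would run inside the metastore, so clients that bypass the DDL layer cannot skip it.

```python
import re

# ASCII letters, digits and underscore only; at least one character.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_name(name):
    """True if `name` consists only of alphanumeric and _ characters."""
    return _NAME_PATTERN.fullmatch(name) is not None


def validate_table(table_name, column_names):
    """Return the list of offending names; an empty list means all are valid."""
    return [name for name in [table_name, *column_names] if not is_valid_name(name)]
```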



[jira] [Updated] (HIVE-80) Add testcases for concurrent query execution

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-80:
---

Component/s: Server Infrastructure
 Labels: concurrency  (was: )

 Add testcases for concurrent query execution
 

 Key: HIVE-80
 URL: https://issues.apache.org/jira/browse/HIVE-80
 Project: Hive
  Issue Type: Test
  Components: Query Processor, Server Infrastructure
Reporter: Raghotham Murthy
Assignee: Arvind Prabhakar
Priority: Critical
  Labels: concurrency
 Attachments: hive_input_format_race-2.patch


 Can use one driver object per query.



[jira] [Updated] (HIVE-149) Aggregate functions MIN and MAX should support all types

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-149:


Component/s: (was: Query Processor)
 UDF

 Aggregate functions MIN and MAX should support all types
 

 Key: HIVE-149
 URL: https://issues.apache.org/jira/browse/HIVE-149
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: YihueyChyi
Assignee: David Phillips
Priority: Critical





[jira] [Resolved] (HIVE-156) Allow != in place of <>

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-156.
-

Resolution: Duplicate

Fixed in HIVE-899.

 Allow != in place of <>
 ---

 Key: HIVE-156
 URL: https://issues.apache.org/jira/browse/HIVE-156
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: S. Alex Smith
Priority: Trivial

 I'm used to using != for inequality.  It would be nice if Hive supported 
 this as an alternative to <>.



[jira] [Resolved] (HIVE-191) Update methods in Hive class to specify database name

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-191.
-

Resolution: Duplicate

Fixed in HIVE-675.

 Update methods in Hive class to specify database name
 -

 Key: HIVE-191
 URL: https://issues.apache.org/jira/browse/HIVE-191
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Johan Oskarsson
Priority: Minor

 In the query processor module there is a Hive class used to access various 
 Metastore data. Unfortunately most of those methods only work on the default 
 database. We should update them to work on other databases as well by adding 
 a database name parameter. See HIVE-182 for more background information.



[jira] [Updated] (HIVE-293) report deserialize exceptions from serde's via exceptions

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-293:


Component/s: Diagnosability

 report deserialize exceptions from serde's via exceptions
 -

 Key: HIVE-293
 URL: https://issues.apache.org/jira/browse/HIVE-293
 Project: Hive
  Issue Type: Bug
  Components: Diagnosability, Serializers/Deserializers
Reporter: Joydeep Sen Sarma

 lazyserde and dynamicserde should report exceptions on number (and other) 
 parsing errors so higher layers can take the correct action



[jira] [Resolved] (HIVE-301) Ability to store row counts (and other stats) in metastore and obtain them via queries

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach resolved HIVE-301.
-

Resolution: Duplicate

I think this was covered by the recent stats work.

 Ability to store row counts (and other stats) in metastore and obtain them 
 via queries
 --

 Key: HIVE-301
 URL: https://issues.apache.org/jira/browse/HIVE-301
 Project: Hive
  Issue Type: New Feature
Reporter: Joydeep Sen Sarma

 now that we have insertion row counts being bubbled out of the execution path 
 - it would be good to stash them away in the metastore. It would also be good 
 to have them be viewable by some simple command (like the mysql status 
 commands - but perhaps we have something we could re-use already).



[jira] [Updated] (HIVE-305) Port Hadoop streaming's counters/status reporters to Hive Transforms

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-305:


Component/s: Query Processor

 Port Hadoop streaming's counters/status reporters to Hive Transforms
 

 Key: HIVE-305
 URL: https://issues.apache.org/jira/browse/HIVE-305
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Venky Iyer

 https://issues.apache.org/jira/browse/HADOOP-1328
 introduced a way for a streaming process to update global counters and 
 status by emitting information on its stderr stream. Use 
 reporter:counter:<group>,<counter>,<amount> to update a counter. Use 
 reporter:status:<message> to update the status. 
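
For reference, a transform script using this convention might look like the Python sketch below. It assumes the reporter: line format introduced by HADOOP-1328; the group and counter names (TransformSketch, MalformedRows), the tab-separated two-column input and the helper names are made up for the example.

```python
import sys


def counter_line(group, counter, amount):
    """Format a counter update in the streaming reporter convention."""
    return f"reporter:counter:{group},{counter},{amount}"


def status_line(message):
    """Format a status update in the streaming reporter convention."""
    return f"reporter:status:{message}"


def transform(lines):
    """Upper-case the value column; collect reporter lines for bad rows."""
    rows, reports = [], []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 2:
            reports.append(counter_line("TransformSketch", "MalformedRows", 1))
            continue
        key, value = fields
        rows.append(f"{key}\t{value.upper()}")
    reports.append(status_line(f"processed {len(rows)} rows"))
    return rows, reports


def emit(rows, reports):
    """Write data rows to stdout and reporter lines to stderr."""
    for report in reports:
        print(report, file=sys.stderr)
    for row in rows:
        print(row)
```

A script used in a TRANSFORM clause would call emit(*transform(sys.stdin)); the port requested here is for Hive to interpret those stderr lines the way Hadoop streaming does.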



[jira] [Updated] (HIVE-345) Extend Date UDFs to support time zone and full specs as in MySQL

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-345:


Component/s: (was: Query Processor)
 UDF

 Extend Date UDFs to support time zone and full specs as in MySQL
 

 Key: HIVE-345
 URL: https://issues.apache.org/jira/browse/HIVE-345
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Affects Versions: 0.3.0
Reporter: Zheng Shao

 Most of the Date UDFs in Hive are currently based on String instead of Date 
 objects, and they have limited functionality compared with MySQL.
 http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_from-unixtime
 http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-add
 http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-sub
 http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_datediff
 We should make it fully compliant with what MySQL offers.



[jira] [Updated] (HIVE-361) Support seeks in some Hive File Formats

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-361:


Component/s: Serializers/Deserializers

 Support seeks in some Hive File Formats
 ---

 Key: HIVE-361
 URL: https://issues.apache.org/jira/browse/HIVE-361
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Zheng Shao

 Seek support can be useful for a few applications:
 1. Filter out a set of records quickly when the data is sorted on the 
 filtering key;
 2. Create a random sample out of a File.
 This might not be a short-term goal, but let's keep the discussions here so 
 it does not get lost.



[jira] [Updated] (HIVE-357) Add order-sensitive and order-insensitive hashing aggregation functions (UDAF)

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-357:


Component/s: UDF

 Add order-sensitive and order-insensitive hashing aggregation functions (UDAF)
 --

 Key: HIVE-357
 URL: https://issues.apache.org/jira/browse/HIVE-357
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor, UDF
Reporter: Zheng Shao
Assignee: Zheng Shao

 In order to test whether a new version of Hive produces exactly the same 
 result as an older version, we usually want to run a bunch of big queries as 
 well as small queries.
 It's hard to compare the result of big queries, but if we have a hashing 
 aggregation function, we can just aggregate the result of big queries and 
 compare the single number.
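
One possible shape for the two functions, sketched in Python for brevity. The specific choices (SHA-256 truncated to 64 bits, summing row hashes modulo 2^64 for the order-insensitive variant, a running digest for the order-sensitive one) are assumptions for illustration, not what the eventual UDAF necessarily does.

```python
import hashlib

_MASK_64 = (1 << 64) - 1


def _encode(row):
    """Serialize a row (tuple of values) to bytes in a stable way."""
    return "\x1f".join(repr(value) for value in row).encode("utf-8")


def row_hash(row):
    """64-bit hash of a single row, stable across runs and machines."""
    return int.from_bytes(hashlib.sha256(_encode(row)).digest()[:8], "big")


def order_insensitive_hash(rows):
    """Sum of row hashes modulo 2**64; addition commutes, so order is ignored.

    Unlike XOR, a sum still changes when a row is duplicated.
    """
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & _MASK_64
    return total


def order_sensitive_hash(rows):
    """Feed rows into one running digest, so any reordering changes the result."""
    digest = hashlib.sha256()
    for row in rows:
        encoded = _encode(row)
        digest.update(len(encoded).to_bytes(8, "big"))  # length prefix avoids ambiguity
        digest.update(encoded)
    return int.from_bytes(digest.digest()[:8], "big")
```

The order-insensitive form is also easy to compute as a partial aggregation, since partial sums from different mappers can simply be added.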



[jira] [Updated] (HIVE-364) Hive Operators should calculate the value of common expressions just once

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-364:


Component/s: (was: Serializers/Deserializers)
 Query Processor

 Hive Operators should calculate the value of common expressions just once
 -

 Key: HIVE-364
 URL: https://issues.apache.org/jira/browse/HIVE-364
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Zheng Shao

 Currently, if we have t.a + t.b in 2 different expressions in the select 
 clause / where clause, we are computing it twice.
 We should cache the value of the expression evaluation result to save CPU 
 time.
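
A small sketch of the caching idea, in Python for brevity. Expressions are modelled as nested tuples, and evaluate, OPS and the evaluation counter are hypothetical constructs for the example; Hive's expression evaluators are structured differently.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul, ">": operator.gt}


def evaluate(expr, row, cache, counter):
    """Evaluate `expr` against `row`, reusing results cached for this row.

    expr is ("col", name), ("const", value) or (op, left, right).
    `cache` must be a fresh dict for every row; `counter` records how many
    times each operator was actually computed.
    """
    if expr in cache:
        return cache[expr]
    kind = expr[0]
    if kind == "col":
        value = row[expr[1]]
    elif kind == "const":
        value = expr[1]
    else:
        left = evaluate(expr[1], row, cache, counter)
        right = evaluate(expr[2], row, cache, counter)
        counter[kind] = counter.get(kind, 0) + 1
        value = OPS[kind](left, right)
    cache[expr] = value
    return value
```

Because structurally equal tuples hash alike, t.a + t.b in the select clause and in the where clause hit the same cache entry for a given row.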



[jira] [Updated] (HIVE-436) MIN and MAX should inherit type

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-436:


Component/s: UDF

 MIN and MAX should inherit type
 ---

 Key: HIVE-436
 URL: https://issues.apache.org/jira/browse/HIVE-436
 Project: Hive
  Issue Type: Wish
  Components: UDF
Reporter: Adam Kramer

 MIN and MAX functions currently return the DOUBLE type...but really, they 
 should return the same type as the column they operate on.
 In some cases like SUM, it's possible that the result would overflow making 
 DOUBLE more useful as it can drop digits and swap to scientific notation, but 
 MIN and MAX by definition cannot have this problem because the answers are 
 always represented in the column they are run across.
 Easy workaround: CAST all of my MINs and MAXes. It's just a wish.



[jira] [Updated] (HIVE-441) Convert field, index, AND, OR operators to GenericUDF

2011-04-05 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-441:


Component/s: UDF

 Convert field, index, AND, OR operators to GenericUDF
 -

 Key: HIVE-441
 URL: https://issues.apache.org/jira/browse/HIVE-441
 Project: Hive
  Issue Type: Improvement
  Components: UDF
Reporter: Zheng Shao
Assignee: Zheng Shao

 Once the GenericUDF framework is in, we should convert exprNodeFieldDesc, 
 exprNodeIndexDesc to GenericUDF to simplify the code.
 We should also convert AND and OR to GenericUDF in order to take advantage of 
 short-circuit evaluation.
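
The short-circuit behaviour could be sketched as follows, in Python for brevity. Operands are passed as zero-argument callables so they are evaluated only when needed, and None stands in for SQL NULL; lazy_and and lazy_or are hypothetical names, not GenericUDF classes.

```python
def lazy_and(*operands):
    """Three-valued AND that stops at the first operand evaluating to False."""
    saw_null = False
    for operand in operands:
        value = operand()
        if value is False:
            return False  # remaining operands are never evaluated
        if value is None:
            saw_null = True
    return None if saw_null else True


def lazy_or(*operands):
    """Three-valued OR that stops at the first operand evaluating to True."""
    saw_null = False
    for operand in operands:
        value = operand()
        if value is True:
            return True  # remaining operands are never evaluated
        if value is None:
            saw_null = True
    return None if saw_null else False
```

Note that a NULL operand cannot end the evaluation early, since a later False (for AND) or True (for OR) still decides the result.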



Build failed in Jenkins: Hive-0.7.0-h0.20 #67

2011-04-05 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/67/changes

Changes:

[cws] HIVE-2054. Exception on windows when using the jdbc driver. 'IOException: 
The system cannot find the path specified' (Bennie Schut via cws)

--
[...truncated 27327 lines...]
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051701_762911921.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=<number>
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=<number>
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=<number>
[junit] Job running in-process (local Hadoop)
[junit] 2011-04-05 17:01:29,841 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-26_771_2233068310906990849/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/build/service/tmp/hive_job_log_hudson_201104051701_458747547.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://hudson.apache.org/hudson/job/Hive-0.7.0-h0.20/ws/hive/data/files/kv1.txt'
 into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-04-05_17-01-31_397_1104374875420903108/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable

[jira] [Created] (HIVE-2096) throw a error if the input is larger than a threshold for index input format

2011-04-05 Thread Namit Jain (JIRA)
throw a error if the input is larger than a threshold for index input format


 Key: HIVE-2096
 URL: https://issues.apache.org/jira/browse/HIVE-2096
 Project: Hive
  Issue Type: Bug
Reporter: Namit Jain
Assignee: He Yongqiang


This can hang forever.



[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016200#comment-13016200
 ] 

He Yongqiang commented on HIVE-2090:


Can you add an authorization check for DROP DATABASE in this JIRA?

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: HIVE-2090.3.patch

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: (was: HIVE-2090.3.patch)

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Attachment: HIVE-2090.3.patch

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



[jira] [Updated] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2090:
--

Status: Patch Available  (was: Open)

Moved the logic of dropping tables to the ObjectStore level. The concurrency 
bug will be handled in separate JIRAs, HIVE-2093 and HIVE-2094. 

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016204#comment-13016204
 ] 

Siying Dong commented on HIVE-2090:
---

Yongqiang, adding an authorization check for dropping and creating databases 
can take more effort than this. I'll see whether it is easy to finish it in 
HIVE-2093 together with the concurrency check. It doesn't sound like it 
belongs in this JIRA.

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 A DROP DATABASE ... FORCE will be useful, when we use a database for 
 isolation when doing some tests. Being able to force cleaning up the database 
 will make test cleaning up easier.



Review Request: HIVE-2090: Add DROP DATABASE ... FORCE

2011-04-05 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/551/
---

Review request for hive.


Summary
---

https://issues.apache.org/jira/secure/attachment/12475548/HIVE-2090.3.patch


This addresses bug HIVE-2090.
https://issues.apache.org/jira/browse/HIVE-2090


Diffs
-

  trunk/metastore/if/hive_metastore.thrift 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1088810 
  trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1088810 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1088810 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1088810 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1088810 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DropDatabaseDesc.java 1088810 
  trunk/ql/src/test/queries/clientpositive/database.q 1088810 
  trunk/ql/src/test/results/clientpositive/database.q.out 1088810 

Diff: https://reviews.apache.org/r/551/diff


Testing
---


Thanks,

Carl



[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016205#comment-13016205
 ] 

jirapos...@reviews.apache.org commented on HIVE-2090:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/551/
---

Review request for hive.


Summary
---

https://issues.apache.org/jira/secure/attachment/12475548/HIVE-2090.3.patch


This addresses bug HIVE-2090.
https://issues.apache.org/jira/browse/HIVE-2090


Diffs
-

  trunk/metastore/if/hive_metastore.thrift 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1088810 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1088810 
  trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1088810 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1088810 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1088810 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1088810 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1088810 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1088810 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DropDatabaseDesc.java 1088810 
  trunk/ql/src/test/queries/clientpositive/database.q 1088810 
  trunk/ql/src/test/results/clientpositive/database.q.out 1088810 

Diff: https://reviews.apache.org/r/551/diff


Testing
---


Thanks,

Carl



 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 DROP DATABASE ... FORCE will be useful when we use a database for 
 isolation during tests. Being able to force cleanup of the database 
 will make test cleanup easier.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2068:
--

Status: Patch Available  (was: Open)

 Speed up query select xx,xx from xxx LIMIT xxx if no filtering or 
 aggregation
 ---

 Key: HIVE-2068
 URL: https://issues.apache.org/jira/browse/HIVE-2068
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
 Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, 
 HIVE-2068.4.patch


 Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT 
 xxx starts a MapReduce job whose input is the whole table or 
 partition. The latency can be huge if the table or partition is big. We could 
 reduce the number of input files to speed up such queries.
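
A rough sketch of the idea of reducing input files for a LIMIT query. The class, method, and the assumedBytesPerRow parameter are hypothetical and are not taken from any HIVE-2068 patch; the sketch simply stops adding files once an estimated row count covers the limit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the approach or API of the HIVE-2068 patch.
final class LimitInputPruningSketch {
    /**
     * Picks input files in order until the estimated number of rows
     * (file size / assumed bytes per row) reaches the requested limit.
     */
    static List<Long> chooseInputs(List<Long> fileSizesBytes, long limit, long assumedBytesPerRow) {
        List<Long> chosen = new ArrayList<>();
        long estimatedRows = 0;
        for (long size : fileSizesBytes) {
            if (estimatedRows >= limit) {
                break; // enough rows are expected from the files chosen so far
            }
            chosen.add(size);
            estimatedRows += size / assumedBytesPerRow;
        }
        return chosen;
    }
}
```

Because the row count is only an estimate, a scheme like this would presumably need a fallback to the full input when the pruned input yields fewer rows than the limit.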

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-2068) Speed up query select xx,xx from xxx LIMIT xxx if no filtering or aggregation

2011-04-05 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2068:
--

Attachment: HIVE-2068.4.patch

Addressing Namit's comments.

 Speed up query select xx,xx from xxx LIMIT xxx if no filtering or 
 aggregation
 ---

 Key: HIVE-2068
 URL: https://issues.apache.org/jira/browse/HIVE-2068
 Project: Hive
  Issue Type: Improvement
Reporter: Siying Dong
Assignee: Siying Dong
 Attachments: HIVE-2068.1.patch, HIVE-2068.2.patch, HIVE-2068.3.patch, 
 HIVE-2068.4.patch


 Currently, select xx,xx from xxx where ...(only partition conditions) LIMIT 
 xxx starts a MapReduce job whose input is the whole table or 
 partition. The latency can be huge if the table or partition is big. We could 
 reduce the number of input files to speed up such queries.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-2090) Add DROP DATABASE ... FORCE

2011-04-05 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016209#comment-13016209
 ] 

Siying Dong commented on HIVE-2090:
---

One thing to note: when dropping all tables in a database, all files under the 
warehouse root of the DB are deleted. Data from tables/partitions in locations 
that are not under that root won't be deleted. This is similar to the 
current approach for dropping a table: data in partitions won't be deleted if 
their locations are not under the table's location.
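
To illustrate the rule described above, here is a minimal sketch that works out which table/partition locations would go away with the database root. The class and method names are made up for illustration, and it uses plain java.nio paths rather than any Hive or HDFS API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not code from the HIVE-2090 patch.
final class DropScopeSketch {
    /** Returns the locations that lie under dbRoot and so would be removed with it. */
    static List<String> deletedWithRoot(String dbRoot, List<String> locations) {
        Path root = Paths.get(dbRoot).normalize();
        List<String> deleted = new ArrayList<>();
        for (String loc : locations) {
            // Path.startsWith compares whole path elements, so a sibling
            // directory such as "testdb.db2" does not match "testdb.db".
            if (Paths.get(loc).normalize().startsWith(root)) {
                deleted.add(loc);
            }
        }
        return deleted;
    }
}
```

Any location not returned here (for example an external table stored elsewhere) would keep its data after the drop.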

 Add DROP DATABASE ... FORCE
 -

 Key: HIVE-2090
 URL: https://issues.apache.org/jira/browse/HIVE-2090
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2090.1.patch, HIVE-2090.2.patch, HIVE-2090.3.patch


 DROP DATABASE ... FORCE will be useful when we use a database for 
 isolation during tests. Being able to force cleanup of the database 
 will make test cleanup easier.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-1803) Implement bitmap indexing in Hive

2011-04-05 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1803:
-

Status: Open  (was: Patch Available)

I'm getting test failures still.

* TestMinimrCliDriver:join1
* TestMTQueries:testMTQueries1
* TestParse:  44/45 tests failed

These all need fixes before commit.


 Implement bitmap indexing in Hive
 -

 Key: HIVE-1803
 URL: https://issues.apache.org/jira/browse/HIVE-1803
 Project: Hive
  Issue Type: New Feature
  Components: Indexing
Reporter: Marquis Wang
Assignee: Marquis Wang
 Attachments: HIVE-1803.1.patch, HIVE-1803.10.patch, 
 HIVE-1803.11.patch, HIVE-1803.2.patch, HIVE-1803.3.patch, HIVE-1803.4.patch, 
 HIVE-1803.5.patch, HIVE-1803.6.patch, HIVE-1803.7.patch, HIVE-1803.8.patch, 
 HIVE-1803.9.patch, JavaEWAH_20110304.zip, bitmap_index_1.png, 
 bitmap_index_2.png, javaewah.jar, javaewah.jar, unit-tests.patch


 Implement bitmap index handler to complement compact indexing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-1988) Make the delegation token issued by the MetaStore owned by the right user

2011-04-05 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016241#comment-13016241
 ] 

jirapos...@reviews.apache.org commented on HIVE-1988:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review394
---

Ship it!


+1

- Amareshwari


On 2011-04-05 21:24:34, Devaraj Das wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/528/
bq.  ---
bq.  
bq.  (Updated 2011-04-05 21:24:34)
bq.  
bq.  
bq.  Review request for hive.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Fixes to some security issues discussed in HIVE-1988
bq.  
bq.  
bq.  This addresses bug HIVE-1988.
bq.  https://issues.apache.org/jira/browse/HIVE-1988
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155 
bq.    http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155 
bq.  
bq.  Diff: https://reviews.apache.org/r/528/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  New unit test added and that passes. All unit tests passed.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Devaraj
bq.  
bq.



 Make the delegation token issued by the MetaStore owned by the right user
 -

 Key: HIVE-1988
 URL: https://issues.apache.org/jira/browse/HIVE-1988
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Security, Server Infrastructure
Affects Versions: 0.7.0
Reporter: Devaraj Das
Assignee: Devaraj Das
 Fix For: 0.8.0

 Attachments: hive-1988-3.patch, hive-1988.patch


 The 'owner' of any delegation token issued by the MetaStore is set to the 
 requesting user. When a delegation token is requested by the user himself 
 during a job submission, this is fine. However, when the token is 
 requested by a service (e.g., Oozie) on behalf of the user, the token's 
 owner is set to the user the service is running as. Later on, when the token 
 is used by a MapReduce task, the MetaStore treats the incoming request as 
 coming from Oozie and performs operations as Oozie. This means any new 
 directories created on HDFS by the MetaStore (e.g., by create_table) will end 
 up with Oozie as the owner.
 Also, the MetaStore doesn't check whether a user asking for a token on behalf 
 of some other user is actually authorized to act on behalf of that other 
 user. We should start using the ProxyUser authorization in the MetaStore 
 (HADOOP-6510's APIs).
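
The ownership rule the description asks for can be sketched as below. This is only an illustration under assumptions: the class, the method, and the allowedProxies map are hypothetical stand-ins, and the sketch does not use the real Hadoop ProxyUser APIs or the code in the attached patches.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; the real check would go through Hadoop's
// proxy-user authorization rather than an in-memory map.
final class TokenOwnerSketch {
    /**
     * Decides who should own a delegation token. A caller requesting a token
     * for itself owns it; a service requesting one on behalf of another user
     * must be allowed to impersonate that user, who then becomes the owner.
     */
    static String resolveOwner(String caller, String onBehalfOf,
                               Map<String, Set<String>> allowedProxies) {
        if (onBehalfOf == null || onBehalfOf.equals(caller)) {
            return caller;
        }
        Set<String> allowed = allowedProxies.get(caller);
        if (allowed == null || !allowed.contains(onBehalfOf)) {
            throw new SecurityException(
                caller + " is not authorized to impersonate " + onBehalfOf);
        }
        return onBehalfOf;
    }
}
```

With an owner resolved this way, a token obtained by Oozie for a user would be owned by that user, so later MetaStore operations would run as the user rather than as Oozie.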

--
This message is automatically generated by JIRA.
For more 

[jira] [Commented] (HIVE-1095) Hive in Maven

2011-04-05 Thread Amareshwari Sriramadasu (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13016242#comment-13016242
 ] 

Amareshwari Sriramadasu commented on HIVE-1095:
---

bq. I tried the first command: ant make-maven -Dversion=0.8.0-SNAPSHOT -logfile 
make-maven.log and it seems to have succeeded.
It succeeded for me too, whereas maven-publish failed with 401/authorization 
errors.

bq. It would be nice if someone who has the knowledge could take a look and see if 
it is correct. 
Giri/Carl, can you help here?

 Hive in Maven
 -

 Key: HIVE-1095
 URL: https://issues.apache.org/jira/browse/HIVE-1095
 Project: Hive
  Issue Type: Task
  Components: Build Infrastructure
Affects Versions: 0.6.0
Reporter: Gerrit Jansen van Vuuren
Priority: Minor
 Attachments: HIVE-1095-trunk.patch, HIVE-1095.v2.PATCH, 
 HIVE-1095.v3.PATCH, HIVE-1095.v4.PATCH, HIVE-1095.v5.PATCH, 
 hiveReleasedToMaven.tar.gz, make-maven.log


 Getting hive into maven main repositories
 Documentation on how to do this is on:
 http://maven.apache.org/guides/mini/guide-central-repository-upload.html

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Review Request: HIVE-1988

2011-04-05 Thread Amareshwari Sriramadasu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/528/#review394
---

Ship it!


+1

- Amareshwari


On 2011-04-05 21:24:34, Devaraj Das wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/528/
 ---
 
 (Updated 2011-04-05 21:24:34)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 Fixes to some security issues discussed in HIVE-1988
 
 
 This addresses bug HIVE-1988.
 https://issues.apache.org/jira/browse/HIVE-1988
 
 
 Diffs
 -
 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java 1089155 
   http://svn.apache.org/repos/asf/hive/trunk/shims/src/test/org/apache/hadoop/hive/thrift/TestHadoop20SAuthBridge.java 1089155 
 
 Diff: https://reviews.apache.org/r/528/diff
 
 
 Testing
 ---
 
 New unit test added and that passes. All unit tests passed.
 
 
 Thanks,
 
 Devaraj
 



