[jira] [Commented] (HIVE-8045) SQL standard auth with cli - Errors and configuration issues

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140051#comment-14140051
 ] 

Hive QA commented on HIVE-8045:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669816/HIVE-8045.3.patch

{color:green}SUCCESS:{color} +1 6295 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/870/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/870/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-870/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669816

 SQL standard auth with cli - Errors and configuration issues
 

 Key: HIVE-8045
 URL: https://issues.apache.org/jira/browse/HIVE-8045
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Jagruti Varia
Assignee: Thejas M Nair
 Attachments: HIVE-8045.1.patch, HIVE-8045.2.patch, HIVE-8045.3.patch


 HIVE-7533 enabled sql std authorization to be set in hive cli (without 
 enabling authorization checks). This updates the hive configuration so that 
 create-table and create-view set permissions appropriately for the owner of 
 the table.
 HIVE-7209 added a metastore authorization provider that can be used to 
 restrict calls made to the authorization api, so that only HS2 can make 
 those calls (when HS2 uses an embedded metastore).
 Some issues were found with this:
 # Even if hive.security.authorization.enabled=false, authorization checks 
 were happening for non-SQL statements such as add/delete/dfs/compile, which 
 results in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing an error.
 # Create table from hive-cli ended up making a metastore server api call 
 (getRoles) and resulted in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing an 
 error.
 # Some users prefer to enable authorization using hive-site.xml for 
 hive-server2 (the hive.security.authorization.enabled param). If this file is 
 shared by hive-cli and hive-server2, the SQL std authorizer throws an error 
 because its use in hive-cli is not allowed.
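
For context, a rough sketch of the kind of configuration split the description refers to (property values and factory class names are as I recall them and may vary by Hive version; this is not the patch itself):

{code}
// Sketch only: SQL std auth enforced in HiveServer2, config-only variant in hive-cli.
import org.apache.hadoop.hive.conf.HiveConf;

public class AuthConfigSketch {
  public static void main(String[] args) {
    // HiveServer2 side: SQL standard authorization with checks enabled.
    HiveConf hs2Conf = new HiveConf();
    hs2Conf.set("hive.security.authorization.enabled", "true");
    hs2Conf.set("hive.security.authorization.manager",
        "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory");

    // hive-cli side (the HIVE-7533 behavior): config-only authorizer, checks disabled,
    // so create-table/create-view still set owner permissions correctly.
    HiveConf cliConf = new HiveConf();
    cliConf.set("hive.security.authorization.enabled", "false");
    cliConf.set("hive.security.authorization.manager",
        "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory");

    System.out.println(hs2Conf.get("hive.security.authorization.manager"));
    System.out.println(cliConf.get("hive.security.authorization.manager"));
  }
}
{code}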



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-8188:
--
Attachment: udf-deterministic.png

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Gopal V (JIRA)
Gopal V created HIVE-8188:
-

 Summary: ExprNodeGenericFuncEvaluator::_evaluate() loads class 
annotations in a tight loop
 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png

When running a near-constant UDF, most of the CPU is burnt within the VM trying 
to read the class annotations for every row.

!udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7325) Support non-constant expressions for ARRAY/MAP type indices.

2014-09-19 Thread Lefty Leverenz (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140110#comment-14140110
 ] 

Lefty Leverenz commented on HIVE-7325:
--

The doc looks good, thanks [~jdere].

 Support non-constant expressions for ARRAY/MAP type indices.
 

 Key: HIVE-7325
 URL: https://issues.apache.org/jira/browse/HIVE-7325
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1
Reporter: Mala Chikka Kempanna
Assignee: Navis
 Fix For: 0.14.0

 Attachments: HIVE-7325.1.patch.txt, HIVE-7325.2.patch.txt, 
 HIVE-7325.3.patch.txt, HIVE-7325.4.patch.txt


 Here is my sample:
 {code}
 CREATE TABLE RECORD(RecordID string, BatchDate string, Country string) 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,D:BatchDate,D:Country") 
 TBLPROPERTIES ("hbase.table.name" = "RECORD"); 
 CREATE TABLE KEY_RECORD(KeyValue String, RecordId map<string,string>) 
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, K:") 
 TBLPROPERTIES ("hbase.table.name" = "KEY_RECORD"); 
 {code}
 The following join statement doesn't work. 
 {code}
 SELECT a.*, b.* from KEY_RECORD a join RECORD b 
 WHERE a.RecordId[b.RecordID] is not null;
 {code}
 FAILED: SemanticException 2:16 Non-constant expression for map indexes not 
 supported. Error encountered near token 'RecordID' 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8045) SQL standard auth with cli - Errors and configuration issues

2014-09-19 Thread Jason Dere (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140115#comment-14140115
 ] 

Jason Dere commented on HIVE-8045:
--

+1

 SQL standard auth with cli - Errors and configuration issues
 

 Key: HIVE-8045
 URL: https://issues.apache.org/jira/browse/HIVE-8045
 Project: Hive
  Issue Type: Bug
  Components: Authorization
Reporter: Jagruti Varia
Assignee: Thejas M Nair
 Attachments: HIVE-8045.1.patch, HIVE-8045.2.patch, HIVE-8045.3.patch


 HIVE-7533 enabled sql std authorization to be set in hive cli (without 
 enabling authorization checks). This updates the hive configuration so that 
 create-table and create-view set permissions appropriately for the owner of 
 the table.
 HIVE-7209 added a metastore authorization provider that can be used to 
 restrict calls made to the authorization api, so that only HS2 can make 
 those calls (when HS2 uses an embedded metastore).
 Some issues were found with this:
 # Even if hive.security.authorization.enabled=false, authorization checks 
 were happening for non-SQL statements such as add/delete/dfs/compile, which 
 results in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing an error.
 # Create table from hive-cli ended up making a metastore server api call 
 (getRoles) and resulted in MetaStoreAuthzAPIAuthorizerEmbedOnly throwing an 
 error.
 # Some users prefer to enable authorization using hive-site.xml for 
 hive-server2 (the hive.security.authorization.enabled param). If this file is 
 shared by hive-cli and hive-server2, the SQL std authorizer throws an error 
 because its use in hive-cli is not allowed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7984) AccumuloOutputFormat Configuration items from StorageHandler not re-set in Configuration in Tez

2014-09-19 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-7984:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks [~elserj]!

 AccumuloOutputFormat Configuration items from StorageHandler not re-set in 
 Configuration in Tez
 ---

 Key: HIVE-7984
 URL: https://issues.apache.org/jira/browse/HIVE-7984
 Project: Hive
  Issue Type: Bug
  Components: StorageHandler, Tez
Reporter: Josh Elser
Assignee: Josh Elser
 Fix For: 0.14.0

 Attachments: HIVE-7984-1.diff, HIVE-7984-1.patch, HIVE-7984.1.patch


 Ran AccumuloStorageHandler queries with Tez and found that configuration 
 elements that are pulled from the {{-hiveconf}} and passed to the 
 inputJobProperties or outputJobProperties by the AccumuloStorageHandler 
 aren't available inside of the Tez container.
 I'm guessing that there is a disconnect between the configuration that the 
 StorageHandler creates and what the Tez container sees.
 The HBaseStorageHandler likely doesn't run into this because it expects to 
 have hbase-site.xml available via tmpjars (and can extrapolate connection 
 information from that file). Accumulo's site configuration file is not meant 
 to be shared with consumers which means that this exact approach is not 
 sufficient.
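
A minimal sketch of the flow being described, under the assumption that the storage handler copies selected session settings into per-job properties (the method and property names below are hypothetical, not Accumulo's actual keys; in Hive this would happen in the storage handler's input/output job property hooks):

{code}
// Illustrative only: copy selected -hiveconf settings into the job properties
// that a storage handler would hand to its input/output formats. The bug is
// that these properties were not visible inside the Tez container.
import java.util.HashMap;
import java.util.Map;

public class JobPropsSketch {
  static Map<String, String> buildJobProperties(Map<String, String> sessionConf) {
    Map<String, String> jobProps = new HashMap<>();
    for (Map.Entry<String, String> e : sessionConf.entrySet()) {
      if (e.getKey().startsWith("accumulo.")) {   // hypothetical key prefix
        jobProps.put(e.getKey(), e.getValue());
      }
    }
    return jobProps;
  }

  public static void main(String[] args) {
    Map<String, String> sessionConf = new HashMap<>();
    sessionConf.put("accumulo.instance.name", "test");         // hypothetical keys,
    sessionConf.put("accumulo.zookeepers", "localhost:2181");  // e.g. set via -hiveconf
    System.out.println(buildJobProperties(sessionConf));
  }
}
{code}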



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7946) CBO: Merge CBO changes to Trunk

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140144#comment-14140144
 ] 

Hive QA commented on HIVE-7946:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669918/HIVE-7946.13.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 6295 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_if
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_correctness
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mapjoin_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_metadataonly1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_mrr
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_bmj_schema_evolution
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_union
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_left_outer_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_mapjoin_reduce
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_mapjoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_nested_mapjoin
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/871/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/871/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-871/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669918

 CBO: Merge CBO changes to Trunk
 ---

 Key: HIVE-7946
 URL: https://issues.apache.org/jira/browse/HIVE-7946
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-7946.1.patch, HIVE-7946.10.patch, 
 HIVE-7946.11.patch, HIVE-7946.12.patch, HIVE-7946.13.patch, 
 HIVE-7946.2.patch, HIVE-7946.3.patch, HIVE-7946.4.patch, HIVE-7946.5.patch, 
 HIVE-7946.6.patch, HIVE-7946.7.patch, HIVE-7946.8.patch, HIVE-7946.9.patch, 
 HIVE-7946.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Vaibhav Gumashta (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140152#comment-14140152
 ] 

Vaibhav Gumashta commented on HIVE-8138:


[~dongc] I fixed the param naming as part of HIVE-7935. 

 Global Init file should allow specifying file name  not only directory
 --

 Key: HIVE-8138
 URL: https://issues.apache.org/jira/browse/HIVE-8138
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8138.patch


 HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
 However, since .hiverc is a hidden file, this can be confusing. The property 
 should allow a path to either a file or a directory.
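
A minimal sketch of the resolution the description asks for, assuming the property value may point at either a directory containing .hiverc or directly at a file (illustrative, not the attached patch):

{code}
// Accept either a directory (old behavior: look for the hidden .hiverc inside it)
// or a direct file path (proposed behavior).
import java.io.File;

public class GlobalInitFileSketch {
  static File resolveInitFile(String configuredPath) {
    File f = new File(configuredPath);
    if (f.isDirectory()) {
      return new File(f, ".hiverc");   // directory: use the hidden file inside it
    }
    return f;                          // file: use it as-is
  }

  public static void main(String[] args) {
    System.out.println(resolveInitFile("/etc/hive/conf"));                // directory
    System.out.println(resolveInitFile("/etc/hive/conf/global.hiverc"));  // file
  }
}
{code}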



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140154#comment-14140154
 ] 

Prasanth J commented on HIVE-8188:
--

I think it's because hash aggregation needs to estimate the size of the hash 
map. The values of the hash map are UDAFs whose aggregation buffer size can be 
estimated if the aggregation buffer has the 
@AggregationType(estimable = true) annotation. GroupByOperator.shouldBeFlushed() is 
called for every row that is added to the hash map. shouldBeFlushed() calls the 
isEstimable() helper function, which uses reflection every time to see if the 
aggregation function is estimable. Not sure why it is done this way, but yes, 
this will be slow as hell. This needs to be fixed.
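
To illustrate the cost pattern (hypothetical names, not Hive's actual GroupByOperator code): a per-row reflective annotation lookup versus resolving the annotation once and caching the boolean for the hot loop.

{code}
// Illustrative sketch only. @Estimable stands in for @AggregationType(estimable = true).
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class EstimableCheckSketch {
  @Retention(RetentionPolicy.RUNTIME)
  @interface Estimable { }

  @Estimable
  static class SumBuffer { long sum; }

  // What the comment describes: a reflective check that would run per row.
  static boolean isEstimable(Object aggBuffer) {
    return aggBuffer.getClass().getAnnotation(Estimable.class) != null;
  }

  public static void main(String[] args) {
    SumBuffer buf = new SumBuffer();

    // Fix direction: resolve the annotation once, reuse the cached result in the loop.
    final boolean estimable = isEstimable(buf);

    long flushChecks = 0;
    for (int row = 0; row < 1_000_000; row++) {
      if (estimable) {        // no reflection inside the hot loop
        flushChecks++;
      }
    }
    System.out.println("checks: " + flushChecks);
  }
}
{code}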

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7980) Hive on spark issue..

2014-09-19 Thread alton.jung (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140164#comment-14140164
 ] 

alton.jung commented on HIVE-7980:
--

Thanks for it..

I got really confused about the current version (Hive on Spark)..
I succeeded with the query through the Hive CLI, but when I tested it with Beeline or 
JDBC I always hit this error...
I am wondering whether the current version supports queries over JDBC and Beeline..

[Error]
java.lang.NullPointerException
at 
org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
at 
org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
at 
org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
at 
org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
at 
org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
at 
org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:76)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
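
For reference, a minimal JDBC client along the lines the reporter describes might look like this (connection URL, credentials, and table are placeholders):

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJdbcSmokeTest {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hive.jdbc.HiveDriver");
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         Statement stmt = conn.createStatement();
         // "order by" forces a shuffle stage, which is where the NPE shows up.
         ResultSet rs = stmt.executeQuery(
             "select * from test where id = 1 order by id")) {
      while (rs.next()) {
        System.out.println(rs.getString(1));
      }
    }
  }
}
{code}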

 Hive on spark issue..
 -

 Key: HIVE-7980
 URL: https://issues.apache.org/jira/browse/HIVE-7980
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Spark
Affects Versions: spark-branch
 Environment: Test Environment is..
 . hive 0.14.0(spark branch version)
 . spark 
 (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
 . hadoop 2.4.0 (yarn)
Reporter: alton.jung
Assignee: Chao
 Fix For: spark-branch


 I followed this 
 guide (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) 
 and compiled hive from the spark branch. In the next step I met the below 
 error..
 (*I typed the hive query in beeline; I used a simple query with "order by" 
 to invoke the parallel work, 
ex) select * from test where id = 1 order by id;
 )
 [Error list is]
 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
 (SparkClient.java:execute(158)) - Error generating Spark Plan
 java.lang.NullPointerException
   at 
 org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
   at 
 org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
   at 
 org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
   at 
 org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask
 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=Driver.execute 
 start=1409824527954 end=1409824691182 duration=163228 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks 
 start=1409824691223 end=1409824691224 duration=1 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,306 ERROR operation.Operation 
 (SQLOperation.java:run(199)) - Error running hive query: 
 org.apache.hive.service.cli.HiveSQLException: Error while processing 
 statement: FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask
   at 
 org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
   

[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Status: Patch Available  (was: Open)

 Fetch task conversion: Remove some dependencies on AST
 --

 Key: HIVE-8179
 URL: https://issues.apache.org/jira/browse/HIVE-8179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch


 Fetch task conversion does some strange things:
 For instance: select * from (select * from x) t won't get converted even 
 though it's the exact same operator plan as: select * from x.
 Or: select * from foo will get converted with "minimal", but select <list all 
 columns of foo> from foo won't.
 We also check the AST for group by etc., but then do the same thing in the 
 operator tree again.
 I'm also wondering why we ship with "moar" as default, but test with 
 "minimal" in the unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Status: Open  (was: Patch Available)

 Fetch task conversion: Remove some dependencies on AST
 --

 Key: HIVE-8179
 URL: https://issues.apache.org/jira/browse/HIVE-8179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch


 Fetch task conversion does some strange things:
 For instance: select * from (select * from x) t won't get converted even 
 though it's the exact same operator plan as: select * from x.
 Or: select * from foo will get converted with "minimal", but select <list all 
 columns of foo> from foo won't.
 We also check the AST for group by etc., but then do the same thing in the 
 operator tree again.
 I'm also wondering why we ship with "moar" as default, but test with 
 "minimal" in the unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8179) Fetch task conversion: Remove some dependencies on AST

2014-09-19 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-8179:
-
Attachment: HIVE-8179.2.patch

 Fetch task conversion: Remove some dependencies on AST
 --

 Key: HIVE-8179
 URL: https://issues.apache.org/jira/browse/HIVE-8179
 Project: Hive
  Issue Type: Bug
Reporter: Gunther Hagleitner
Assignee: Gunther Hagleitner
 Attachments: HIVE-8179.1.patch, HIVE-8179.2.patch


 Fetch task conversion does some strange things:
 For instance: select * from (select * from x) t won't get converted even 
 though it's the exact same operator plan as: select * from x.
 Or: select * from foo will get converted with "minimal", but select <list all 
 columns of foo> from foo won't.
 We also check the AST for group by etc., but then do the same thing in the 
 operator tree again.
 I'm also wondering why we ship with "moar" as default, but test with 
 "minimal" in the unit tests.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140209#comment-14140209
 ] 

Hive QA commented on HIVE-7482:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669838/HIVE-7482.5.patch

{color:red}ERROR:{color} -1 due to 68 failed/errored test(s), 6310 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketcontext_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketizedhiveinputformat_auto
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_mine
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nulls
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_nullsafe
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_parquet_join
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_11
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_12
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_16
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_smb_mapjoin_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sort_merge_join_desc_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udaf_corr
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_bucketmapjoin1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_1
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucketmapjoin6
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_map_operators
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_quotedid_smb
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_smb_mapjoin_8
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testEqualOpClass
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testGreaterThanOpClass
org.apache.hadoop.hive.ql.io.TestHiveBinarySearchRecordReader.testGreaterThanOrEqualOpClass

Re: Review Request 25575: HIVE-7615: Beeline should have an option for user to see the query progress

2014-09-19 Thread Dong Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25575/
---

(Updated Sept. 19, 2014, 9:22 a.m.)


Review request for hive.


Changes
---

Update the patch based on comments. Mainly change the HiveStatement exposed 
public API to minimal. So remove the QueryState.


Repository: hive-git


Description
---

When executing a query in Beeline, the user should have an option to see the progress 
through the output. Beeline could use the API introduced in HIVE-4629 to get 
and display the logs to the client.
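
A rough sketch of the client-side pattern this review describes: poll a log-fetching API on the statement while the query runs. The method names below (getQueryLog(), hasMoreLogs()) are assumptions based on the review description, not a confirmed final API; the connection URL and query are placeholders.

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import org.apache.hive.jdbc.HiveStatement;

public class QueryLogPollingSketch {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection(
             "jdbc:hive2://localhost:10000/default", "hive", "");
         HiveStatement stmt = (HiveStatement) conn.createStatement()) {

      // Background poller that streams query logs while the statement runs.
      Thread logPoller = new Thread(() -> {
        try {
          while (stmt.hasMoreLogs()) {                  // assumed API
            for (String line : stmt.getQueryLog()) {    // assumed API
              System.err.println(line);                 // progress goes to stderr
            }
            Thread.sleep(500);
          }
        } catch (Exception ignore) {
          // statement closed, or log fetching not supported by the server
        }
      });

      logPoller.start();
      stmt.execute("select count(*) from sample_07");   // placeholder query
      logPoller.join();
    }
  }
}
{code}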


Diffs (updated)
-

  beeline/pom.xml 45fa02b 
  beeline/src/java/org/apache/hive/beeline/Commands.java a92d69f 
  
itests/hive-unit/src/test/java/org/apache/hive/beeline/TestBeeLineWithArgs.java 
1e66542 
  itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestJdbcDriver2.java 
daf8e9e 
  jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java 2cbf58c 

Diff: https://reviews.apache.org/r/25575/diff/


Testing
---

UT passed.


Thanks,

Dong Chen



[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140232#comment-14140232
 ] 

Prasanth J commented on HIVE-8188:
--

I tried to avoid invoking this reflection multiple times in the inner loop by 
computing the total aggregation size once and reusing it in the inner loop. I ran 
the following query:
{code}
select ss_quantity, ss_store_sk, ss_promo_sk, count(ss_list_price), 
count(ss_sales_price), sum(ss_ext_sales_price) from store_sales_orc group by 
ss_quantity,ss_store_sk,ss_promo_sk;
{code}

store_sales had 2,880,404 rows. The original execution time was 18.5s; with 
the above changes the time went down to 15.5s, a ~17% gain, which explains 
the reflection cost seen in the attached image.

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7615) Beeline should have an option for user to see the query progress

2014-09-19 Thread Dong Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dong Chen updated HIVE-7615:

Attachment: HIVE-7615.2.patch

Hi [~thejas], I have updated the patch based on your comments. I agree that we 
should minimize the exposed public API to avoid confusion for users. This is a 
valuable comment that makes the interface design better.

In the patch, I removed the QueryState and use a private boolean 
isExecuteStatementFailed to let the getQueryLog() method signal problems through 
thrown exceptions. Callers can know what exactly happened from the exceptions.
The reason I did not use a boolean isRunning is that, when it is false, the 
"not running" state could actually be divided into two states: not running 
before the statement is executed, and not running after the statement has been 
executed successfully. If we used this boolean to control getQueryLog() and made 
it fail whenever the statement is not running, a JDBC user might not be able to 
get logs after the query is done.

Could you please take a look at it and see how that sounds? :)
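
Purely as an illustration of the distinction being made here (not code from the patch), a single isRunning boolean collapses three situations that behave differently for log fetching:

{code}
// Not from the patch; just naming the three states the comment distinguishes.
enum StatementState {
  NOT_STARTED,           // "not running" before the statement is executed
  RUNNING,               // logs are clearly fetchable here
  FINISHED_SUCCESSFULLY  // "not running" again, yet remaining logs must still be readable
}
{code}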



 Beeline should have an option for user to see the query progress
 

 Key: HIVE-7615
 URL: https://issues.apache.org/jira/browse/HIVE-7615
 Project: Hive
  Issue Type: Improvement
  Components: CLI
Reporter: Dong Chen
Assignee: Dong Chen
 Attachments: HIVE-7615.1.patch, HIVE-7615.2.patch, HIVE-7615.patch, 
 complete_logs, simple_logs


 When executing a query in Beeline, the user should have an option to see the 
 progress through the output.
 Beeline could use the API introduced in HIVE-4629 to get and display the logs 
 to the client.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Damien Carol (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140286#comment-14140286
 ] 

Damien Carol commented on HIVE-8185:


I wonder if this bug is only present in the CBO branch.
I use trunk and I don't have this bug, but when I'm using the CBO branch, the 
metastore/hiveserver2 or beeline throws this error.

 hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
 build
 ---

 Key: HIVE-8185
 URL: https://issues.apache.org/jira/browse/HIVE-8185
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.14.0
Reporter: Gopal V
Priority: Critical
 Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch


 In the current build, running
 {code}
 jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
 Jar verification failed.
 {code}
 Unless that jar is removed from the lib dir, all hive queries throw the 
 following error: 
 {code}
 Exception in thread "main" java.lang.SecurityException: Invalid signature 
 file digest for Manifest main attributes
   at 
 sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
   at 
 sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
   at 
 sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7974) Notification Event Listener movement to a new top level repl/ module

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140297#comment-14140297
 ] 

Hive QA commented on HIVE-7974:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669536/HIVE-7974.patch

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/873/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/873/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-873/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669536

 Notification Event Listener movement to a new top level repl/ module
 

 Key: HIVE-7974
 URL: https://issues.apache.org/jira/browse/HIVE-7974
 Project: Hive
  Issue Type: Sub-task
  Components: Import/Export
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-7974.patch


 We need to create a new hive module (say hive-repl? ) to subsume the 
 NotificationListener from HCatalog.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140399#comment-14140399
 ] 

Hive QA commented on HIVE-8184:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669842/HIVE-8184.1.patch

{color:red}ERROR:{color} -1 due to 471 failed/errored test(s), 6294 tests 
executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestAccumuloCliDriver.testCliDriver_accumulo_queries
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_acid_vectorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_char2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_2_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_merge_stats_orc
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_rename_partition_authorization
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_alter_varchar2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_annotate_stats_select
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_array_map_access_nonconstant
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join17
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join19
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join24
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join26
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_smb_mapjoin_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_10
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_14
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_sortmerge_join_9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_decimal_native
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_avro_nullable_fields
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ba_table_udfs
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucket3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_5
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_6
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_7
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketsortoptimize_insert_8
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_2
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_nested_types
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_char_udf1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine3
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer13
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer9
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_insert_outputformat
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view

[jira] [Commented] (HIVE-6799) HiveServer2 needs to map kerberos name to local name before proxy check

2014-09-19 Thread LINTE (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140444#comment-14140444
 ] 

LINTE commented on HIVE-6799:
-

I commented out hive.metastore.uris in hive-site.xml and then restarted 
hiveserver2 with an embedded metastore and a local derby database.

I get many exceptions for each hive request from knox, but it works.







 HiveServer2 needs to map kerberos name to local name before proxy check
 ---

 Key: HIVE-6799
 URL: https://issues.apache.org/jira/browse/HIVE-6799
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Dilli Arumugam
Assignee: Dilli Arumugam
 Fix For: 0.14.0

 Attachments: HIVE-6799.1.patch, HIVE-6799.2.patch, HIVE-6799.patch


 HiveServer2 does not map the kerberos name of the authenticated principal to a 
 local name.
 Due to this, I get errors like the following in the HiveServer log:
 Failed to validate proxy privilage of knox/hdps.example.com for sam
 I have kinit'ed as knox/hdps.example@example.com
 I do have the following in core-site.xml:
   <property>
     <name>hadoop.proxyuser.knox.groups</name>
     <value>users</value>
   </property>
   <property>
     <name>hadoop.proxyuser.knox.hosts</name>
     <value>*</value>
   </property>
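
As a toy illustration of the mapping the title asks for (not HiveServer2 code): the proxy-user check should be done against the local short name derived from the principal. Real deployments would rely on Hadoop's auth_to_local rules rather than string surgery like this:

{code}
public class ShortNameSketch {
  // "knox/host.example.com@EXAMPLE.COM" -> "knox"
  static String toShortName(String principal) {
    int slash = principal.indexOf('/');
    int at = principal.indexOf('@');
    int end = slash >= 0 ? slash : (at >= 0 ? at : principal.length());
    return principal.substring(0, end);
  }

  public static void main(String[] args) {
    String principal = "knox/host.example.com@EXAMPLE.COM";  // placeholder principal
    // hadoop.proxyuser.<shortName>.hosts/groups should be checked against "knox",
    // not the full principal string.
    System.out.println(toShortName(principal));
  }
}
{code}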



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8115) Hive select query hang when fields contain map

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140455#comment-14140455
 ] 

Hive QA commented on HIVE-8115:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669865/HIVE-8115.2.patch

{color:green}SUCCESS:{color} +1 6293 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/875/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/875/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-875/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669865

 Hive select query hang when fields contain map
 --

 Key: HIVE-8115
 URL: https://issues.apache.org/jira/browse/HIVE-8115
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
 Attachments: HIVE-8115.1.patch, HIVE-8115.2.patch, createTable.hql, 
 data


 Attached is a repro of the issue. When creating a table and loading the data 
 attached, every hive query hangs, even just select * from the table.
 Repro steps:
 1. run createTable.hql
 2. hadoop fs -put data /data
 3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
 4. SELECT * FROM testtable;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e

2014-09-19 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140486#comment-14140486
 ] 

Naveen Gangam commented on HIVE-7647:
-

[~xuefuz] I will attach a new patch today.

 Beeline does not honor --headerInterval and --color when executing with -e
 

 Key: HIVE-7647
 URL: https://issues.apache.org/jira/browse/HIVE-7647
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7647.1.patch


 --showHeader is being honored
 [root@localhost ~]# beeline --showHeader=false -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
 | 11-2021  | Marketing managers   | 166790 | 118160  |
 | 11-2022  | Sales managers   | 333910 | 110390  |
 | 11-2031  | Public relations managers| 51730  | 101220  |
 | 11-3011  | Administrative services managers | 246930 | 79500   |
 +--+--++-+
 10 rows selected (0.838 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 --outputFormat is being honored.
 [root@localhost ~]# beeline --outputFormat=csv -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 'code','description','total_emp','salary'
 '00-','All Occupations','135185230','42270'
 '11-','Management occupations','6152650','100310'
 '11-1011','Chief executives','301930','160440'
 '11-1021','General and operations managers','1697690','107970'
 '11-1031','Legislators','64650','37980'
 '11-2011','Advertising and promotions managers','36100','94720'
 '11-2021','Marketing managers','166790','118160'
 '11-2022','Sales managers','333910','110390'
 '11-2031','Public relations managers','51730','101220'
 '11-3011','Administrative services managers','246930','79500'
 10 rows selected (0.664 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 both --color & --headerInterval are being honored when executing using the -f 
 option (reads the query from a file rather than the command line; cannot really 
 see the color here, but it uses the terminal colors)
 [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
 -u 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 0: jdbc:hive2://localhost select * from sample_07 limit 8;
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 11-2011  | Advertising and 

[jira] [Commented] (HIVE-7420) Parameterize tests for HCatalog Pig interfaces for testing against all storage formats

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140680#comment-14140680
 ] 

Hive QA commented on HIVE-7420:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669860/HIVE-7420.6.patch

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 6401 tests executed
*Failed tests:*
{noformat}
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[1]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[2]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[3]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[4]
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes[5]
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/876/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/876/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-876/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669860

 Parameterize tests for HCatalog Pig interfaces for testing against all 
 storage formats
 --

 Key: HIVE-7420
 URL: https://issues.apache.org/jira/browse/HIVE-7420
 Project: Hive
  Issue Type: Sub-task
  Components: HCatalog
Reporter: David Chen
Assignee: David Chen
 Attachments: HIVE-7420-without-HIVE-7457.2.patch, 
 HIVE-7420-without-HIVE-7457.3.patch, HIVE-7420-without-HIVE-7457.4.patch, 
 HIVE-7420-without-HIVE-7457.5.patch, HIVE-7420.1.patch, HIVE-7420.2.patch, 
 HIVE-7420.3.patch, HIVE-7420.4.patch, HIVE-7420.5.patch, HIVE-7420.6.patch


 Currently, HCatalog tests only test against RCFile with a few testing against 
 ORC. The tests should be covering other Hive storage formats as well.
 HIVE-7286 turns HCatMapReduceTest into a test fixture that can be run with 
 all Hive storage formats and with that patch, all test suites built on 
 HCatMapReduceTest are running and passing against Sequence File, Text, and 
 ORC in addition to RCFile.
 Similar changes should be made to make the tests for HCatLoader and 
 HCatStorer generic so that they can be run against all Hive storage formats.
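
A generic JUnit 4 sketch of the parameterization pattern described above (not the actual HCatLoader/HCatStorer test code): the same test body runs once per storage format.

{code}
import java.util.Arrays;
import java.util.Collection;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class StorageFormatParameterizedTest {
  private final String storageFormat;

  public StorageFormatParameterizedTest(String storageFormat) {
    this.storageFormat = storageFormat;
  }

  @Parameters(name = "{0}")
  public static Collection<Object[]> formats() {
    return Arrays.asList(new Object[][] {
        {"RCFILE"}, {"SEQUENCEFILE"}, {"TEXTFILE"}, {"ORC"}
    });
  }

  @Test
  public void testReadWriteRoundTrip() {
    // In the real suites this would create a table stored as the given format,
    // write through HCatStorer, and read back through HCatLoader.
    System.out.println("running round-trip test against " + storageFormat);
  }
}
{code}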



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e

2014-09-19 Thread Naveen Gangam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Naveen Gangam updated HIVE-7647:

Attachment: HIVE-7647.2.patch

 Beeline does not honor --headerInterval and --color when executing with -e
 

 Key: HIVE-7647
 URL: https://issues.apache.org/jira/browse/HIVE-7647
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7647.1.patch, HIVE-7647.2.patch


 --showHeader is being honored
 [root@localhost ~]# beeline --showHeader=false -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
 | 11-2021  | Marketing managers   | 166790 | 118160  |
 | 11-2022  | Sales managers   | 333910 | 110390  |
 | 11-2031  | Public relations managers| 51730  | 101220  |
 | 11-3011  | Administrative services managers | 246930 | 79500   |
 +--+--++-+
 10 rows selected (0.838 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 --outputFormat is being honored.
 [root@localhost ~]# beeline --outputFormat=csv -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 'code','description','total_emp','salary'
 '00-','All Occupations','135185230','42270'
 '11-','Management occupations','6152650','100310'
 '11-1011','Chief executives','301930','160440'
 '11-1021','General and operations managers','1697690','107970'
 '11-1031','Legislators','64650','37980'
 '11-2011','Advertising and promotions managers','36100','94720'
 '11-2021','Marketing managers','166790','118160'
 '11-2022','Sales managers','333910','110390'
 '11-2031','Public relations managers','51730','101220'
 '11-3011','Administrative services managers','246930','79500'
 10 rows selected (0.664 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 both --color & --headerInterval are being honored when executing using the -f 
 option (reads the query from a file rather than the command line; cannot really 
 see the color here, but it uses the terminal colors)
 [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
 -u 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 0: jdbc:hive2://localhost select * from sample_07 limit 8;
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
 

[jira] [Commented] (HIVE-7647) Beeline does not honor --headerInterval and --color when executing with -e

2014-09-19 Thread Naveen Gangam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140695#comment-14140695
 ] 

Naveen Gangam commented on HIVE-7647:
-

[~xuefuz] Patch has been rebased to the latest trunk. Thank you 

 Beeline does not honor --headerInterval and --color when executing with -e
 

 Key: HIVE-7647
 URL: https://issues.apache.org/jira/browse/HIVE-7647
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.14.0
Reporter: Naveen Gangam
Assignee: Naveen Gangam
Priority: Minor
 Fix For: 0.14.0

 Attachments: HIVE-7647.1.patch, HIVE-7647.2.patch


 --showHeader is being honored
 [root@localhost ~]# beeline --showHeader=false -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 -hiveconf (No such file or directory)
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 | 11-2011  | Advertising and promotions managers  | 36100  | 94720   |
 | 11-2021  | Marketing managers   | 166790 | 118160  |
 | 11-2022  | Sales managers   | 333910 | 110390  |
 | 11-2031  | Public relations managers| 51730  | 101220  |
 | 11-3011  | Administrative services managers | 246930 | 79500   |
 +--+--++-+
 10 rows selected (0.838 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 --outputFormat is being honored.
 [root@localhost ~]# beeline --outputFormat=csv -u 
 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -e select * from sample_07 limit 10;
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 'code','description','total_emp','salary'
 '00-','All Occupations','135185230','42270'
 '11-','Management occupations','6152650','100310'
 '11-1011','Chief executives','301930','160440'
 '11-1021','General and operations managers','1697690','107970'
 '11-1031','Legislators','64650','37980'
 '11-2011','Advertising and promotions managers','36100','94720'
 '11-2021','Marketing managers','166790','118160'
 '11-2022','Sales managers','333910','110390'
 '11-2031','Public relations managers','51730','101220'
 '11-3011','Administrative services managers','246930','79500'
 10 rows selected (0.664 seconds)
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 Closing: org.apache.hive.jdbc.HiveConnection
 both --color and --headerInterval are being honored when executing with the -f 
 option (which reads the query from a file rather than the command line); the color 
 cannot really be seen here, but the terminal colors are used.
 [root@localhost ~]# beeline --showheader=true --color=true --headerInterval=5 
 -u 'jdbc:hive2://localhost:1/default' -n hive -d 
 org.apache.hive.jdbc.HiveDriver -f /tmp/tmp.sql  
 Connecting to jdbc:hive2://localhost:1/default
 Connected to: Apache Hive (version 0.12.0-cdh5.0.1)
 Driver: Hive JDBC (version 0.12.0-cdh5.0.1)
 Transaction isolation: TRANSACTION_REPEATABLE_READ
 Beeline version 0.12.0-cdh5.1.0 by Apache Hive
 0: jdbc:hive2://localhost select * from sample_07 limit 8;
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 +--+--++-+
 | 00-  | All Occupations  | 135185230  | 42270   |
 | 11-  | Management occupations   | 6152650| 100310  |
 | 11-1011  | Chief executives | 301930 | 160440  |
 | 11-1021  | General and operations managers  | 1697690| 107970  |
 | 11-1031  | Legislators  | 64650  | 37980   |
 +--+--++-+
 |   code   | description  | total_emp  | salary  |
 

[jira] [Commented] (HIVE-7689) Enable Postgres as METASTORE back-end

2014-09-19 Thread Alan Gates (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140707#comment-14140707
 ] 

Alan Gates commented on HIVE-7689:
--

All these calls to getEscape make the code hard to read.  If postgres requires 
lower case table and column names I'd prefer to change the postgres version of 
hive-txn-schema.sql to create the tables and columns with lower case names.  
Wouldn't that be easier?
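For illustration only, a minimal sketch of what a lower-cased definition in the Postgres hive-txn-schema.sql could look like; the table and column list here are abbreviated assumptions, not copied from the actual script:

{code}
-- Unquoted identifiers fold to lower case in Postgres, so creating the
-- objects in lower case avoids per-query identifier escaping
-- (hypothetical, abbreviated excerpt).
CREATE TABLE txns (
  txn_id bigint PRIMARY KEY,
  txn_state char(1) NOT NULL,
  txn_started bigint NOT NULL,
  txn_user varchar(128) NOT NULL
);
{code}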

 Enable Postgres as METASTORE back-end
 -

 Key: HIVE-7689
 URL: https://issues.apache.org/jira/browse/HIVE-7689
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Affects Versions: 0.14.0
Reporter: Damien Carol
Assignee: Damien Carol
Priority: Minor
  Labels: metastore, postgres
 Fix For: 0.14.0

 Attachments: HIVE-7689.5.patch, HIVE-7689.6.patch, HIVE-7689.7.patch, 
 HIVE-7689.8.patch, HIVE-7889.1.patch, HIVE-7889.2.patch, HIVE-7889.3.patch, 
 HIVE-7889.4.patch


 I maintain a few patches to make the Metastore work with a Postgres back end in our 
 production environment.
 The main goal of this JIRA is to push these patches upstream.
 This patch enables LOCKS and COMPACTION and fixes an error in STATS on a Postgres 
 metastore.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7359) Stats based compute query replies fail to do simple column transforms

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7359:
---
   Resolution: Fixed
Fix Version/s: 0.14.0
   Status: Resolved  (was: Patch Available)

Committed to trunk.

 Stats based compute query replies fail to do simple column transforms
 -

 Key: HIVE-7359
 URL: https://issues.apache.org/jira/browse/HIVE-7359
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Fix For: 0.14.0

 Attachments: HIVE-7359.patch


 The following two queries return the same answer (the second one is incorrect)
 {code}
 hive> set hive.compute.query.using.stats=true;
 hive> select count(1) from trips;
 OK
 187271461
 Time taken: 0.173 seconds, Fetched: 1 row(s)
 hive> select count(1)/5109828 from trips;
 OK
 187271461
 Time taken: 0.125 seconds, Fetched: 1 row(s)
 {code}
 The second query should have output 36.649 instead of returning the value of count(1).
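 As a hedged illustration of the report above, disabling the stats-based answering path should force the arithmetic to be evaluated; the table name is taken from the example and the expected value from the description:
{code}
-- With stats-based answering on, the division is dropped and the raw count comes back.
set hive.compute.query.using.stats=true;
select count(1)/5109828 from trips;   -- observed: 187271461

-- Turning the optimization off is one way to cross-check the expected result.
set hive.compute.query.using.stats=false;
select count(1)/5109828 from trips;   -- expected: roughly 36.649
{code}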



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7359) Stats based compute query replies fail to do simple column transforms

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-7359:
---
Component/s: Logical Optimizer

 Stats based compute query replies fail to do simple column transforms
 -

 Key: HIVE-7359
 URL: https://issues.apache.org/jira/browse/HIVE-7359
 Project: Hive
  Issue Type: Bug
  Components: Logical Optimizer
Affects Versions: 0.13.0, 0.14.0, 0.13.1
Reporter: Gopal V
Assignee: Ashutosh Chauhan
 Fix For: 0.14.0

 Attachments: HIVE-7359.patch


 The following two queries return the same answer (the second one is incorrect)
 {code}
 hive> set hive.compute.query.using.stats=true;
 hive> select count(1) from trips;
 OK
 187271461
 Time taken: 0.173 seconds, Fetched: 1 row(s)
 hive> select count(1)/5109828 from trips;
 OK
 187271461
 Time taken: 0.125 seconds, Fetched: 1 row(s)
 {code}
 The second query should have output 36.649 instead of returning the value of count(1).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7980) Hive on spark issue..

2014-09-19 Thread Xuefu Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140778#comment-14140778
 ] 

Xuefu Zhang commented on HIVE-7980:
---

[~alton.jung] For Hive, you need the latest from the Spark branch. For Spark, you 
can also use the latest from their master branch. Since both are under 
development, issues can arise. Could you describe what you are trying to do 
and how to reproduce your issue(s)? Thanks.

 Hive on spark issue..
 -

 Key: HIVE-7980
 URL: https://issues.apache.org/jira/browse/HIVE-7980
 Project: Hive
  Issue Type: Bug
  Components: HiveServer2, Spark
Affects Versions: spark-branch
 Environment: Test Environment is..
 . hive 0.14.0(spark branch version)
 . spark 
 (http://ec2-50-18-79-139.us-west-1.compute.amazonaws.com/data/spark-assembly-1.1.0-SNAPSHOT-hadoop2.3.0.jar)
 . hadoop 2.4.0 (yarn)
Reporter: alton.jung
Assignee: Chao
 Fix For: spark-branch


 I followed this 
 guide (https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started) 
 and compiled Hive from the Spark branch. In the next step I hit the error 
 below.
 (I typed the Hive query in Beeline, using a simple query with ORDER BY to 
 invoke the parallel work, e.g.
 select * from test where id = 1 order by id;
 )
 [Error list is]
 2014-09-04 02:58:08,796 ERROR spark.SparkClient 
 (SparkClient.java:execute(158)) - Error generating Spark Plan
 java.lang.NullPointerException
   at 
 org.apache.spark.SparkContext.defaultParallelism(SparkContext.scala:1262)
   at 
 org.apache.spark.SparkContext.defaultMinPartitions(SparkContext.scala:1269)
   at 
 org.apache.spark.SparkContext.hadoopRDD$default$5(SparkContext.scala:537)
   at 
 org.apache.spark.api.java.JavaSparkContext.hadoopRDD(JavaSparkContext.scala:318)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generateRDD(SparkPlanGenerator.java:160)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkPlanGenerator.generate(SparkPlanGenerator.java:88)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkClient.execute(SparkClient.java:156)
   at 
 org.apache.hadoop.hive.ql.exec.spark.session.SparkSessionImpl.submit(SparkSessionImpl.java:52)
   at 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask.execute(SparkTask.java:77)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:161)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:72)
 2014-09-04 02:58:11,108 ERROR ql.Driver (SessionState.java:printError(696)) - 
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask
 2014-09-04 02:58:11,182 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=Driver.execute 
 start=1409824527954 end=1409824691182 duration=163228 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,223 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogBegin(108)) - PERFLOG method=releaseLocks 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,224 INFO  log.PerfLogger 
 (PerfLogger.java:PerfLogEnd(135)) - /PERFLOG method=releaseLocks 
 start=1409824691223 end=1409824691224 duration=1 
 from=org.apache.hadoop.hive.ql.Driver
 2014-09-04 02:58:11,306 ERROR operation.Operation 
 (SQLOperation.java:run(199)) - Error running hive query: 
 org.apache.hive.service.cli.HiveSQLException: Error while processing 
 statement: FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.spark.SparkTask
   at 
 org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:284)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:146)
   at 
 org.apache.hive.service.cli.operation.SQLOperation.access$100(SQLOperation.java:69)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:196)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
   at 
 org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:508)
   at 
 org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:208)
   at 
 java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
   at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at 

[jira] [Created] (HIVE-8189) A select statement with a subquery is failing.

2014-09-19 Thread Yongzhi Chen (JIRA)
Yongzhi Chen created HIVE-8189:
--

 Summary: A select statement with a subquery is failing. 
 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.12.0
Reporter: Yongzhi Chen


The Hive tables in the query are HBase tables, and the subquery is a join statement.
When
set hive.optimize.ppd=true;
  and
set hive.auto.convert.join=false;
the query does not return data, while hive.optimize.ppd=true and 
hive.auto.convert.join=true returns values. See the attached query file.
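A minimal sketch of the reproduction shape described above; the table and column names are placeholders, since the real query is in the attached hbase_ppd_join.q:

{code}
-- Hypothetical HBase-backed tables standing in for the ones in the attachment.
set hive.optimize.ppd=true;
set hive.auto.convert.join=false;   -- per the report, true makes the rows come back

select t.k, t.v
from (select a.k, b.v
      from hbase_t1 a
      join hbase_t2 b on (a.k = b.k)) t
where t.k = '1';
{code}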



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8094) add LIKE keyword support for SHOW FUNCTIONS

2014-09-19 Thread peter liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

peter liu updated HIVE-8094:

Attachment: HIVE-8094.2.patch

 add LIKE keyword support for SHOW FUNCTIONS
 ---

 Key: HIVE-8094
 URL: https://issues.apache.org/jira/browse/HIVE-8094
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.14.0, 0.13.1
Reporter: peter liu
Assignee: peter liu
 Fix For: 0.14.0

 Attachments: HIVE-8094.1.patch, HIVE-8094.2.patch


 It would be nice to add LIKE keyword support for SHOW FUNCTIONS as below, 
 keeping the pattern consistent with SHOW DATABASES and SHOW TABLES.
 bq. SHOW FUNCTIONS LIKE 'foo*';
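 A short sketch of the proposed statement next to the existing pattern forms it would mirror:
{code}
-- Existing pattern-matching forms:
SHOW DATABASES LIKE 'foo*';
SHOW TABLES LIKE 'foo*';

-- Proposed addition, for consistency:
SHOW FUNCTIONS LIKE 'foo*';
{code}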



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing.

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Attachment: hbase_ppd_join.q

The query can reproduce the issue. 

 A select statement with a subquery is failing. 
 ---

 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.1
Reporter: Yongzhi Chen
 Attachments: hbase_ppd_join.q


 Hive tables in the query are hbase tables, and the subquery is a join 
 statement.
 When
 set hive.optimize.ppd=true;
   and
 set hive.auto.convert.join=false;
 The query does not return data. 
 While hive.optimize.ppd=true and hive.auto.convert.join=true return values 
 back. See attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-7100) Users of hive should be able to specify skipTrash when dropping tables.

2014-09-19 Thread david serafini (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140786#comment-14140786
 ] 

david serafini commented on HIVE-7100:
--

No.  It hasn't.  I've looked at the code a little, but I haven't found an
answer yet. I'm a novice with the hive code, and I didn't do the original
work on this ticket - I'm just trying to get it finished.   I'll probably
need to find time to find or write a test case to verify the behavior.

On Wed, Sep 17, 2014 at 10:27 PM, Lefty Leverenz (JIRA) j...@apache.org



 Users of hive should be able to specify skipTrash when dropping tables.
 ---

 Key: HIVE-7100
 URL: https://issues.apache.org/jira/browse/HIVE-7100
 Project: Hive
  Issue Type: Improvement
Affects Versions: 0.13.0
Reporter: Ravi Prakash
Assignee: david serafini
 Attachments: HIVE-7100.1.patch, HIVE-7100.10.patch, 
 HIVE-7100.2.patch, HIVE-7100.3.patch, HIVE-7100.4.patch, HIVE-7100.5.patch, 
 HIVE-7100.8.patch, HIVE-7100.9.patch, HIVE-7100.patch


 Users of our clusters are often running up against their quota limits because 
 of Hive tables. When they drop tables, they have to then manually delete the 
 files from HDFS using skipTrash. This is cumbersome and unnecessary. We 
 should enable users to skipTrash directly when dropping tables.
 We should also be able to provide this functionality without polluting SQL 
 syntax.
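 As a sketch only, with a hypothetical table name: today the cleanup is a two-step, manual process, and the request is to fold the skipTrash behavior into the drop itself. The PURGE-style modifier below is shown purely to illustrate the desired behavior, not as the approach this ticket settles on.
{code}
-- Today: drop the table, then purge its files by hand outside of Hive
-- (typically with hadoop fs -rm -r -skipTrash on the table directory).
DROP TABLE big_staging_table;

-- Desired: a drop that bypasses the trash in one step, e.g.
DROP TABLE big_staging_table PURGE;
{code}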



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Edit access to the Hive Wiki

2014-09-19 Thread Pavan Lanka
Hi,

I have registered myself as pavibhai on https://cwiki.apache.org and
would like edit privileges so that I can contribute to the content.


Regards,
Pavan


Re: Edit access to the Hive Wiki

2014-09-19 Thread Xuefu Zhang
Done!

On Fri, Sep 19, 2014 at 8:55 AM, Pavan Lanka pavib...@gmail.com wrote:

 Hi,

 I have registered myself as pavibhai on the https://cwiki.apache.org and
 would like change privilege so that I can contribute to the content.


 Regards,
 Pavan



[jira] [Commented] (HIVE-8162) hive.optimize.sort.dynamic.partition causes RuntimeException for inserting into dynamic partitioned table when map function is used in the subquery

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140814#comment-14140814
 ] 

Hive QA commented on HIVE-8162:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669863/HIVE-8162.1.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_infer_bucket_sort_dyn_part
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/877/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/877/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-877/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669863

 hive.optimize.sort.dynamic.partition causes RuntimeException for inserting 
 into dynamic partitioned table when map function is used in the subquery 
 

 Key: HIVE-8162
 URL: https://issues.apache.org/jira/browse/HIVE-8162
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Na Yang
Assignee: Prasanth J
 Attachments: 47rows.txt, HIVE-8162.1.patch


 Exception:
 Diagnostic Messages for this Task:
 java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: 
 Hive Runtime Error: Unable to deserialize reduce input key from 
 x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
   at 
 org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:518)
   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:462)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1122)
   at org.apache.hadoop.mapred.Child.main(Child.java:271)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime 
 Error: Unable to deserialize reduce input key from 
 x1x129x51x83x14x1x128x0x0x2x1x1x1x120x95x112x114x111x100x117x99x116x95x105x100x0x1x0x0x255
  with properties {columns=reducesinkkey0,reducesinkkey1,reducesinkkey2, 
 serialization.lib=org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe,
  serialization.sort.order=+++, columns.types=int,map<string,string>,int}
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:222)
   ... 7 more
 Caused by: org.apache.hadoop.hive.serde2.SerDeException: java.io.EOFException
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:189)
   at 
 org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:220)
   ... 7 more
 Caused by: java.io.EOFException
   at 
 org.apache.hadoop.hive.serde2.binarysortable.InputByteBuffer.read(InputByteBuffer.java:54)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserializeInt(BinarySortableSerDe.java:533)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:236)
   at 
 org.apache.hadoop.hive.serde2.binarysortable.BinarySortableSerDe.deserialize(BinarySortableSerDe.java:185)
   ... 8 more
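 Before the reproduction steps below, a hedged sketch of the statement shape the summary describes, an insert into a dynamically partitioned table whose subquery builds a map; the table and column names are assumptions, and disabling the named setting is a plausible check rather than a confirmed fix:
{code}
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.optimize.sort.dynamic.partition=true;   -- the setting the summary points at

-- Hypothetical shape of the failing insert.
INSERT OVERWRITE TABLE target PARTITION (ds)
SELECT t.id, t.attrs, t.ds
FROM (SELECT id, map('k1', v1, 'k2', v2) AS attrs, ds FROM source) t;

-- Possible check: set hive.optimize.sort.dynamic.partition=false; and rerun.
{code}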
 Step to reproduce the exception:
 -
 CREATE TABLE associateddata(creative_id int,creative_group_id int,placement_id
 int,sm_campaign_id int,browser_id string, trans_type_p string,trans_time_p
 string,group_name string,event_name string,order_id string,revenue
 float,currency string, trans_type_ci string,trans_time_ci string,f16
 map<string,string>,campaign_id int,user_agent_cat string,geo_country 
 string,geo_city string,geo_state string,geo_zip string,geo_dma string,geo_area
 string,geo_isp 

[jira] [Created] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)
LINTE created HIVE-8190:
---

 Summary: LDAP user match for authentication on hiveserver2
 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE


Some LDAP directories use CN rather than UID as the user component.

So when you try to authenticate, the LDAP authentication module of Hive binds with 
the following string:

uid=$login,basedn

Some AD deployments have user objects keyed by cn rather than uid, so it is important 
to be able to customize the kind of object the authentication module looks for in LDAP.

As an example, in the Knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for:
1/ uid : 
 - uid={0},basedn

2/ or cn :
- cn={0},basedn






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0},basedn

or cn : cn={0},basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : uid={0}, basedn
 or cn : cn={0}, basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0},basedn

or cn : cn={0},basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : 
-uid={0},basedn

or cn :
-cn={0},basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : uid={0},basedn
 or cn : cn={0},basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 
uid={0}, basedn

or cn :
cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid == uid= {0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : 
 uid={0}, basedn
 or cn :
 cn={0}, basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid == uid= {0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid == uid={0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid == uid= {0}, basedn
 uid : uid={0}, basedn
 or cn : cn={0}, basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : 
-uid={0},basedn

or cn :
-cn={0},basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :
1/ uid : 
 - uid={0},basedn

2/ or cn :
- cn={0},basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : 
 -uid={0},basedn
 or cn :
 -cn={0},basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 'uid={0}, basedn'

or cn : 'cn={0}, basedn'




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : uid={0}, basedn

or cn : cn={0}, basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : 'uid={0}, basedn'
 or cn : 'cn={0}, basedn'



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :


uid : 
uid={0}, basedn

or cn :
cn={0}, basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid : uid={0}, basedn
 or cn : cn={0}, basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8190) LDAP user match for authentication on hiveserver2

2014-09-19 Thread LINTE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LINTE updated HIVE-8190:

Description: 
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid == uid={0}, basedn


uid : uid={0}, basedn

or cn : cn={0}, basedn




  was:
Some LDAP has the user composant as CN and not UID.

SO when you try to authenticate the LDAP authentication module of hive try to 
authenticate with the following string :  

uid=$login,basedn

Some AD have user objects that are not uid but cn, so it is be important to 
personalize the kind of objects that the authentication moduel look for in ldap.

We can see an exemple in knox LDAP module configuration the parameter 
main.ldapRealm.userDnTemplate can be configured to look for :

uid : uid={0}, basedn

or cn : cn={0}, basedn





 LDAP user match for authentication on hiveserver2
 -

 Key: HIVE-8190
 URL: https://issues.apache.org/jira/browse/HIVE-8190
 Project: Hive
  Issue Type: Improvement
  Components: Authorization, Clients
Affects Versions: 0.13.1
 Environment: Centos 6.5
Reporter: LINTE

 Some LDAP has the user composant as CN and not UID.
 SO when you try to authenticate the LDAP authentication module of hive try to 
 authenticate with the following string :  
 uid=$login,basedn
 Some AD have user objects that are not uid but cn, so it is be important to 
 personalize the kind of objects that the authentication moduel look for in 
 ldap.
 We can see an exemple in knox LDAP module configuration the parameter 
 main.ldapRealm.userDnTemplate can be configured to look for :
 uid == uid={0}, basedn
 uid : uid={0}, basedn
 or cn : cn={0}, basedn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Apply for Hive contributor

2014-09-19 Thread Yongzhi Chen
Hi,
I'd like to be a Hive contributor; my JIRA account ID is ychena

Thanks

Yongzhi


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8189:
---
Summary: A select statement with a subquery is failing with HBaseSerde  
(was: A select statement with a subquery is failing. )

 A select statement with a subquery is failing with HBaseSerde
 -

 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.1
Reporter: Yongzhi Chen
 Attachments: hbase_ppd_join.q


 Hive tables in the query are hbase tables, and the subquery is a join 
 statement.
 When
 set hive.optimize.ppd=true;
   and
 set hive.auto.convert.join=false;
 The query does not return data. 
 While hive.optimize.ppd=true and hive.auto.convert.join=true return values 
 back. See attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-5201) Create new initial rev of new hive site in staging

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-5201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-5201.

Resolution: Fixed

 Create new initial rev of new hive site in staging
 --

 Key: HIVE-5201
 URL: https://issues.apache.org/jira/browse/HIVE-5201
 Project: Hive
  Issue Type: Sub-task
Reporter: Brock Noland
Assignee: Brock Noland





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HIVE-4938) Update website to use Apache CMS

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland resolved HIVE-4938.

Resolution: Fixed

 Update website to use Apache CMS
 

 Key: HIVE-4938
 URL: https://issues.apache.org/jira/browse/HIVE-4938
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland

 A 
 [vote|http://mail-archives.apache.org/mod_mbox/hive-dev/201307.mbox/%3CCAENxBwx47KQsFRBbBB-i3y1VovBwA8E2dymsfcenkb7X5vhVnQ%40mail.gmail.com%3E]
  was held and we decided to move from Apache Forrest to Apache CMS for the 
 website. This is an uber ticket to track this effort.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8100:
---
Description: Basically the idea here is to have an option to not delete the 
warehouse directory. I am using an env variable so it's always passed to all 
sub-processes.

 Add QTEST_LEAVE_FILES to QTestUtil
 --

 Key: HIVE-8100
 URL: https://issues.apache.org/jira/browse/HIVE-8100
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8100.patch, HIVE-8100.patch


 Basically the idea here is to have an option to not delete the warehouse 
 directory. I am using an env variable so it's always passed to all 
 sub-processes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8138:
---
Attachment: HIVE-8138.patch

 Global Init file should allow specifying file name  not only directory
 --

 Key: HIVE-8138
 URL: https://issues.apache.org/jira/browse/HIVE-8138
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8138.patch, HIVE-8138.patch


 HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
 However since .hiverc is a hidden file this can be confusing. The property 
 should allow a path to a file or a directory.
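 For context, the init file is just a script of HiveQL statements run when a session starts; a minimal sketch of typical contents follows, with the settings and jar path chosen as arbitrary examples rather than anything tied to this patch:
{code}
-- Example global init file / .hiverc contents.
set hive.cli.print.header=true;
set hive.exec.dynamic.partition.mode=nonstrict;
add jar /opt/hive/aux/example-udfs.jar;
{code}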



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 25834: HIVE-8138 - Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25834/
---

Review request for hive.


Repository: hive-git


Description
---

Allows either a file or dir.


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3a045b7 
  service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java 
5231d5e 
  
service/src/test/org/apache/hive/service/cli/session/TestSessionGlobalInitFile.java
 5b1cbc0 

Diff: https://reviews.apache.org/r/25834/diff/


Testing
---


Thanks,

Brock Noland



[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Brock Noland (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140924#comment-14140924
 ] 

Brock Noland commented on HIVE-8138:


https://reviews.apache.org/r/25834/

 Global Init file should allow specifying file name  not only directory
 --

 Key: HIVE-8138
 URL: https://issues.apache.org/jira/browse/HIVE-8138
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8138.patch, HIVE-8138.patch


 HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
 However since .hiverc is a hidden file this can be confusing. The property 
 should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8115) Hive select query hang when fields contain map

2014-09-19 Thread Xiaobing Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaobing Zhou updated HIVE-8115:

Description: 
Attached is a repro of the issue. After creating a table and loading the attached 
data, every Hive query against the table hangs, even a plain select * from the table.

repro steps:
1. run createTable.hql
2. hadoop fs -put data /data
3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
4. SELECT * FROM testtable;

  was:
Attached the repro of the issue. When creating an table loading the data 
attached, all hive query with hangs even just select * from the table.

repro steps:
1. run createTable.hql
2. hadoop fs ls -put data /data
3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
4. SELECT * FROM testtable;


 Hive select query hang when fields contain map
 --

 Key: HIVE-8115
 URL: https://issues.apache.org/jira/browse/HIVE-8115
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.0
Reporter: Xiaobing Zhou
Assignee: Xiaobing Zhou
 Attachments: HIVE-8115.1.patch, HIVE-8115.2.patch, createTable.hql, 
 data


 Attached is a repro of the issue. After creating a table and loading the attached 
 data, every Hive query against the table hangs, even a plain select * from the table.
 repro steps:
 1. run createTable.hql
 2. hadoop fs -put data /data
 3. LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
 4. SELECT * FROM testtable;
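 A hedged sketch of the reproduction; the real DDL is in the attached createTable.hql, so the column list and delimiters below are assumptions that only keep the map-typed field the summary mentions:
{code}
-- Step 1: a table with a map-typed field (hypothetical columns and delimiters).
CREATE TABLE testtable (
  id int,
  props map<string,string>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':';

-- Steps 3-4 from the report: load the uploaded data, then even a plain scan hangs.
LOAD DATA INPATH '/data' OVERWRITE INTO TABLE testtable;
SELECT * FROM testtable;
{code}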



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140929#comment-14140929
 ] 

Sergey Shelukhin commented on HIVE-8111:


ping? [~ashutoshc] [~jpullokkaran]

 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
 

 Key: HIVE-8111
 URL: https://issues.apache.org/jira/browse/HIVE-8111
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8111.patch


 Original test failure: looks like column type changes to different decimals 
 in most cases. In one case it causes the integer part to be too big to fit, 
 so the result becomes null it seems.
 What happens is that CBO adds casts to arithmetic expressions to make them 
 type compatible; these casts become part of new AST, and then Hive adds casts 
 on top of these casts. This (the first part) also causes lots of out file 
 changes. It's not clear how to best fix it so far, in addition to incorrect 
 decimal width and sometimes nulls when width is larger than allowed in Hive.
 Option one - don't add those for numeric ops - cannot be done if numeric op 
 is a part of compare, for which CBO needs correct types.
 Option two - unwrap casts when determining type in Hive - hard or impossible 
 to tell apart CBO-added casts and user casts. 
 Option three - don't change types in Hive if CBO has run - seems hacky and 
 hard to ensure it's applied everywhere.
 Option four - map all expressions precisely between two trees and remove 
 casts again after optimization, will be pretty difficult.
 Option five - somehow mark those casts. Not sure about how yet.
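 A hedged illustration of the kind of expression involved, with a hypothetical table and column: CBO casts the operands so the comparison is type-correct, the cast becomes part of the new AST, and Hive's own type resolution then wraps it in a second cast, shifting the decimal precision/scale.
{code}
-- Hypothetical query mixing an int column with a decimal literal in a comparison.
SELECT c_int + 1.5
FROM t
WHERE c_int + 1.5 > 2;

-- Roughly what happens to the predicate operand, per the description above:
--   after CBO:            CAST(c_int AS decimal(...)) + 1.5
--   after Hive planning:  CAST(CAST(c_int AS decimal(...)) AS decimal(...)) + 1.5
{code}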



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8187) CBO: Change Optiq Type System Precision/scale to use Hive Type System Precision/Scale

2014-09-19 Thread Laljo John Pullokkaran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Laljo John Pullokkaran updated HIVE-8187:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

 CBO: Change Optiq Type System Precision/scale to use Hive Type System 
 Precision/Scale
 -

 Key: HIVE-8187
 URL: https://issues.apache.org/jira/browse/HIVE-8187
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Laljo John Pullokkaran
Assignee: Laljo John Pullokkaran
 Attachments: HIVE-8187.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8105) booleans and nulls not handled properly in insert/values

2014-09-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140938#comment-14140938
 ] 

Eugene Koifman commented on HIVE-8105:
--

It would be useful to add some comments in unparseExprForValuesClause() about 
NULL and FALSE handling.  I don't think it would be clear why this works.
Otherwise
+1 pending tests


 booleans and nulls not handled properly in insert/values
 

 Key: HIVE-8105
 URL: https://issues.apache.org/jira/browse/HIVE-8105
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Attachments: HIVE-8105.2.patch, HIVE-8105.2.patch, HIVE-8105.patch


 Doing an insert/values with a boolean always results in a value of true, 
 regardless of whether true or false is given in the query.
 Doing an insert/values with a null for a column value results in a semantic 
 error.
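 For reference, a minimal sketch of the statements the description refers to; the table and column names are assumptions:
{code}
CREATE TABLE flags (id int, ok boolean);

-- Per the report, the boolean lands as true no matter which literal is given:
INSERT INTO TABLE flags VALUES (1, true), (2, false);

-- Per the report, a null column value fails with a semantic error:
INSERT INTO TABLE flags VALUES (3, null);
{code}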



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8149) hive.optimize.reducededuplication should be set to false for IUD ops

2014-09-19 Thread Eugene Koifman (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140952#comment-14140952
 ] 

Eugene Koifman commented on HIVE-8149:
--

+1 pending tests

 hive.optimize.reducededuplication should be set to false for IUD ops
 

 Key: HIVE-8149
 URL: https://issues.apache.org/jira/browse/HIVE-8149
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Eugene Koifman
Assignee: Alan Gates
 Attachments: HIVE-8149.patch


 this optimizer causes both old and new rows to show up in a select after 
 update (for tables involving few rows)
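 A sketch of the setting named in the summary and the symptom described, with a hypothetical ACID table standing in for the real one:
{code}
-- Per the summary, this should be off for insert/update/delete operations.
set hive.optimize.reducededuplication=false;

UPDATE acid_t SET val = 'new' WHERE id = 1;

-- With the optimization left on, this select was observed to return both the
-- old and the new version of the row on small tables.
SELECT * FROM acid_t WHERE id = 1;
{code}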



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Xuefu Zhang

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


Nice work.

Besides the comments below, I think there are some improvements that can be done, 
either here or in a different patch:

1. If we have a module that can compile an op tree (given by its top ops) into a 
Spark task, then we can reuse it after the original op tree is broken into 
several trees. From each tree we compile a Spark task, and in the end we hook up 
the parent-child relationships. The current logic is a little complicated 
and hard to understand.
2. Tests 
3. Optimizations


ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java
https://reviews.apache.org/r/25394/#comment93732

maybe we can call it opToParentMap?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java
https://reviews.apache.org/r/25394/#comment93733

Comment here?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java
https://reviews.apache.org/r/25394/#comment93736

We should be able to reuse the hash map by emptying the previous one.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
https://reviews.apache.org/r/25394/#comment93817

Let's use meaningful variable names even though they are local.



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
https://reviews.apache.org/r/25394/#comment93820

I feel that the logic here can be simplified. Could we just pop all paths 
and then check if the root is the same and keep doing so until the common 
parent is found?



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
https://reviews.apache.org/r/25394/#comment93821

This seems to cover only the case where all FSs have a common FORWARD 
parent. What if only some of them share a FORWARD parent, but the other FSs and 
the FORWARD operator share some common parent?

I think the rule for whether to break the plan goes like this:

A plan needs to be broken if and only if there is more than one 
FileSinkOperator that can be traced back to a common parent and the tracing has 
to pass through a ReduceSinkOperator on the way.
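A hedged HiveQL illustration of that rule, with hypothetical table names: both branches below aggregate, so each FileSink is reached from the shared scan of src through a ReduceSink, and by the rule above the plan must be broken; if neither branch needed a ReduceSink, it would not.

{code}
FROM src
INSERT OVERWRITE TABLE out1
  SELECT key, count(*) GROUP BY key        -- FileSink 1, behind a ReduceSink
INSERT OVERWRITE TABLE out2
  SELECT value, count(*) GROUP BY value;   -- FileSink 2, behind another ReduceSink
{code}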



ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
https://reviews.apache.org/r/25394/#comment93847

Here we are mapping the children of lca to lca itself. Why is this 
necessary, since you can find the children of lca later without the map? Can't we 
just store lca here?


- Xuefu Zhang


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25394/
 ---
 
 (Updated Sept. 18, 2014, 6:38 p.m.)
 
 
 Review request for hive, Brock Noland and Xuefu Zhang.
 
 
 Bugs: HIVE-7503
 https://issues.apache.org/jira/browse/HIVE-7503
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 For Hive's multi insert query 
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
 may be an MR job for each insert. When we achieve this with Spark, it would 
 be nice if all the inserts can happen concurrently.
 It seems that this functionality isn't available in Spark. To make things 
 worse, the source of the insert may be re-computed unless it's staged. Even 
 with this, the inserts will happen sequentially, making the performance 
 suffer.
  This task is to find out what it takes in Spark to enable this without requiring 
  staging of the source and sequential insertion. If this has to be solved in 
  Hive, find out an optimal way to do this.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
 864965e03a3f9d665e21e1c1b10b19dc286b842f 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
  5fcaf643a0e90fc4acc21187f6d78cefdb1b691a 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25394/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chao Sun
 




[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Attachment: HIVE-7482.6.patch

Fix for failing tests in map reduce.

 The execution side changes for SMB join in hive-tez
 ---

 Key: HIVE-7482
 URL: https://issues.apache.org/jira/browse/HIVE-7482
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
 HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
 HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
 HIVE-7482.WIP.patch


 A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Status: Patch Available  (was: Open)

 The execution side changes for SMB join in hive-tez
 ---

 Key: HIVE-7482
 URL: https://issues.apache.org/jira/browse/HIVE-7482
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
 HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
 HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
 HIVE-7482.WIP.patch


 A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140956#comment-14140956
 ] 

Hive QA commented on HIVE-8185:
---



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669892/HIVE-8185.2.patch

{color:green}SUCCESS:{color} +1 6292 tests passed

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/878/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/878/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-878/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669892

 hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
 build
 ---

 Key: HIVE-8185
 URL: https://issues.apache.org/jira/browse/HIVE-8185
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.14.0
Reporter: Gopal V
Priority: Critical
 Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch


 In the current build, running
 {code}
 jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
 Jar verification failed.
 {code}
 unless that jar is removed from the lib dir, all hive queries throw the 
 following error 
 {code}
 Exception in thread main java.lang.SecurityException: Invalid signature 
 file digest for Manifest main attributes
   at 
 sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
   at 
 sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
   at 
 sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7482) The execution side changes for SMB join in hive-tez

2014-09-19 Thread Vikram Dixit K (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vikram Dixit K updated HIVE-7482:
-
Status: Open  (was: Patch Available)

 The execution side changes for SMB join in hive-tez
 ---

 Key: HIVE-7482
 URL: https://issues.apache.org/jira/browse/HIVE-7482
 Project: Hive
  Issue Type: Bug
  Components: Tez
Affects Versions: tez-branch
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-7482.1.patch, HIVE-7482.2.patch, HIVE-7482.3.patch, 
 HIVE-7482.4.patch, HIVE-7482.5.patch, HIVE-7482.6.patch, 
 HIVE-7482.WIP.2.patch, HIVE-7482.WIP.3.patch, HIVE-7482.WIP.4.patch, 
 HIVE-7482.WIP.patch


 A piece of HIVE-7430.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140959#comment-14140959
 ] 

Szehon Ho commented on HIVE-8138:
-

+1

 Global Init file should allow specifying file name  not only directory
 --

 Key: HIVE-8138
 URL: https://issues.apache.org/jira/browse/HIVE-8138
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8138.patch, HIVE-8138.patch


 HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
 However since .hiverc is a hidden file this can be confusing. The property 
 should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8096) Fix a few small nits in TestExtendedAcls

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140966#comment-14140966
 ] 

Szehon Ho commented on HIVE-8096:
-

+1, thanks for the cleanup.

 Fix a few small nits in TestExtendedAcls
 

 Key: HIVE-8096
 URL: https://issues.apache.org/jira/browse/HIVE-8096
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8096.patch






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8138) Global Init file should allow specifying file name not only directory

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140971#comment-14140971
 ] 

Szehon Ho commented on HIVE-8138:
-

(pending tests)

 Global Init file should allow specifying file name  not only directory
 --

 Key: HIVE-8138
 URL: https://issues.apache.org/jira/browse/HIVE-8138
 Project: Hive
  Issue Type: Bug
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8138.patch, HIVE-8138.patch


 HIVE-5160 allows you to specify a directory where a .hiverc file exists. 
 However since .hiverc is a hidden file this can be confusing. The property 
 should allow a path to a file or a directory.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Attachment: HIVE-8189.1.patch

Need code review.
The patch is for trunk.
Problem:
 1) Predicates are used to filter out regions in hbase which do not need to be scanned.
 2) The predicates are sticking around in the jobConf from the table with predicates.
Solution:
 Removing the predicates before we reset them removes this bad state.


 A select statement with a subquery is failing with HBaseSerde
 -

 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.12.0, 0.13.1
Reporter: Yongzhi Chen
 Attachments: HIVE-8189.1.patch, hbase_ppd_join.q


 Hive tables in the query are hbase tables, and the subquery is a join 
 statement.
 With
 set hive.optimize.ppd=true;
   and
 set hive.auto.convert.join=false;
 the query does not return data, 
 while hive.optimize.ppd=true and hive.auto.convert.join=true returns values. 
 See the attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8189) A select statement with a subquery is failing with HBaseSerde

2014-09-19 Thread Yongzhi Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongzhi Chen updated HIVE-8189:
---
Status: Patch Available  (was: Open)

need code review   

 A select statement with a subquery is failing with HBaseSerde
 -

 Key: HIVE-8189
 URL: https://issues.apache.org/jira/browse/HIVE-8189
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.13.1, 0.12.0
Reporter: Yongzhi Chen
 Attachments: HIVE-8189.1.patch, hbase_ppd_join.q


 Hive tables in the query are hbase tables, and the subquery is a join 
 statement.
 With
 set hive.optimize.ppd=true;
   and
 set hive.auto.convert.join=false;
 the query does not return data, 
 while hive.optimize.ppd=true and hive.auto.convert.join=true returns values. 
 See the attached query file. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25716: Type coercion for union queries.

2014-09-19 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25716/
---

(Updated Sept. 19, 2014, 5:55 p.m.)


Review request for hive and John Pullokkaran.


Changes
---

updated per feedback


Bugs: HIVE-8150
https://issues.apache.org/jira/browse/HIVE-8150


Repository: hive-git


Description
---

Type coercion for union queries.


Diffs (updated)
-

  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0d934ef 

Diff: https://reviews.apache.org/r/25716/diff/


Testing
---

union32.q


Thanks,

Ashutosh Chauhan



[jira] [Commented] (HIVE-6799) HiveServer2 needs to map kerberos name to local name before proxy check

2014-09-19 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-6799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140993#comment-14140993
 ] 

Thejas M Nair commented on HIVE-6799:
-

bq.  and local derby database.
You should be able to use a remote RDBMS as well.


 HiveServer2 needs to map kerberos name to local name before proxy check
 ---

 Key: HIVE-6799
 URL: https://issues.apache.org/jira/browse/HIVE-6799
 Project: Hive
  Issue Type: Improvement
  Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Dilli Arumugam
Assignee: Dilli Arumugam
 Fix For: 0.14.0

 Attachments: HIVE-6799.1.patch, HIVE-6799.2.patch, HIVE-6799.patch


 HiveServer2 does not map the Kerberos name of the authenticated principal to a 
 local name.
 Due to this, I get an error like the following in the HiveServer log:
 Failed to validate proxy privilage of knox/hdps.example.com for sam
 I have kinit'ed as knox/hdps.example@example.com
 I do have the following in core-site.xml
   <property>
     <name>hadoop.proxyuser.knox.groups</name>
     <value>users</value>
   </property>
   <property>
     <name>hadoop.proxyuser.knox.hosts</name>
     <value>*</value>
   </property>
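
A minimal sketch of the mapping being asked for, using Hadoop's KerberosName utility (the proxy-privilege check itself is omitted; names and the example principal are illustrative):

{code}
import java.io.IOException;

import org.apache.hadoop.security.authentication.util.KerberosName;

public class ShortNameSketch {

  // Resolve a principal such as "knox/host.example.com@EXAMPLE.COM" to its local
  // short name (e.g. "knox") via the configured auth_to_local rules, then run the
  // hadoop.proxyuser.* check against that short name instead of the full principal.
  static String toLocalName(String principal) throws IOException {
    return new KerberosName(principal).getShortName();
  }
}
{code}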



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Chao Sun


 On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
  Nice work.
  
  Besides the comment below, I think there are some improvements that can be done, 
  either here or in a different patch:
  
  1. If we have a module that can compile an op tree (given by its top ops) into 
  a Spark task, then we can reuse it after the original op tree is broken 
  into several trees. From each tree, we compile it into a Spark task. In the 
  end, we hook up the parent-child relationships. The current logic is a little 
  complicated and hard to understand.
  2. Tests 
  3. Optimizations

I agree. I can do these in separate follow-up patches.


 On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
   line 142
  https://reviews.apache.org/r/25394/diff/3/?file=693788#file693788line142
 
  Here we are mapping the children of lca to lca itself. Why is this 
  necessary, given that you can find the children of lca later without the map? Can't 
  we just store lca here?

The problem is that we are only generating one FS but multiple TSs. After 
the FS and the first TS are generated, the child-parent relation is lost 
(since the op tree is modified), and hence we need to store this information 
somewhere else, to be used when processing the rest of the TSs.


 On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
   line 140
  https://reviews.apache.org/r/25394/diff/3/?file=693788#file693788line140
 
  This seems to cover only the case where all FSs have a common FORWARD 
  parent. What if only some of them share a FORWARD parent, but the other FSs 
  and the FORWARD operator share some common parent?
  
  I think the rule for whether to break the plan goes like this:
  
  A plan needs to be broken if and only if there is more than one 
  FileSinkOperator that can be traced back to a common parent and the tracing 
  has to pass a ReduceSinkOperator on the way.

In this case the LCA is not a FORWARD, so breaking at this point is safe (though it might not 
be optimal), is that right?
Personally, after so many attempts, I'm a bit inclined to just do what MR does: 
go top-down and keep the first RS in the same SparkWork. For the rest of the RSs, just 
break the plan.


 On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
   line 120
  https://reviews.apache.org/r/25394/diff/3/?file=693788#file693788line120
 
  I feel that the logic here can be simplified. Could we just pop all 
  paths and then check if the root is the same and keep doing so until the 
  common parent is found?

I'm not quite sure. I would happily accept if you have a better algorithm :) 
(the one I'm using is just a standard algorithm for finding the LCA).
The LCA could be at a different place in each path. How do you proceed to pop all 
paths? Also, there could be multiple common parents, but we need to identify 
the lowest one.


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25394/
 ---
 
 (Updated Sept. 18, 2014, 6:38 p.m.)
 
 
 Review request for hive, Brock Noland and Xuefu Zhang.
 
 
 Bugs: HIVE-7503
 https://issues.apache.org/jira/browse/HIVE-7503
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 For Hive's multi-insert query 
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
 may be an MR job for each insert. When we achieve this with Spark, it would 
 be nice if all the inserts could happen concurrently.
 It seems that this functionality isn't available in Spark. To make things 
 worse, the source of the insert may be re-computed unless it's staged. Even 
 with staging, the inserts will happen sequentially, making performance 
 suffer.
 This task is to find out what it takes in Spark to enable this without requiring 
 staging of the source and sequential insertion. If this has to be solved in 
 Hive, find out an optimal way to do it.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
 864965e03a3f9d665e21e1c1b10b19dc286b842f 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
   
 

[jira] [Updated] (HIVE-7883) DBTxnManager trying to close already closed metastore client connection

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-7883:
-
Attachment: HIVE-7883.patch

The real question is why DbTxnManager is creating its own HiveMetaStoreClient 
rather than using the existing one.  The attached patch fixes that and removes 
the close.
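
A minimal sketch of that direction (not the attached patch), assuming Hive.get(conf).getMSC() exposes the shared session client:

{code}
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.IMetaStoreClient;
import org.apache.hadoop.hive.ql.metadata.Hive;

public class SharedMetastoreClientSketch {

  // Reuse the session's metastore client instead of constructing a private
  // HiveMetaStoreClient; its lifecycle (including close()) stays with the owner.
  static IMetaStoreClient sharedClient(HiveConf conf) throws Exception {
    return Hive.get(conf).getMSC();
  }
}
{code}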

 DBTxnManager trying to close already closed metastore client connection
 ---

 Key: HIVE-7883
 URL: https://issues.apache.org/jira/browse/HIVE-7883
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Transactions
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Alan Gates
 Attachments: HIVE-7883.patch


 You will find following log message :
 {code}
 ERROR hive.metastore: Unable to shutdown local metastore client
 org.apache.thrift.transport.TTransportException: Cannot write to null 
 outputStream
at 
 org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
at 
 org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
at 
 org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
at 
 com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:431)
at 
 com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:425)
at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:435)
at 
 org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:304)
at 
 org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.finalize(HiveTxnManagerImpl.java:44)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7883) DBTxnManager trying to close already closed metastore client connection

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-7883:
-
Status: Patch Available  (was: Open)

 DBTxnManager trying to close already closed metastore client connection
 ---

 Key: HIVE-7883
 URL: https://issues.apache.org/jira/browse/HIVE-7883
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Transactions
Affects Versions: 0.14.0
Reporter: Mostafa Mokhtar
Assignee: Alan Gates
 Attachments: HIVE-7883.patch


 You will find following log message :
 {code}
 ERROR hive.metastore: Unable to shutdown local metastore client
 org.apache.thrift.transport.TTransportException: Cannot write to null 
 outputStream
at 
 org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
at 
 org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
at 
 org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
at 
 com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:431)
at 
 com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:425)
at 
 org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:435)
at 
 org.apache.hadoop.hive.ql.lockmgr.DbTxnManager.destruct(DbTxnManager.java:304)
at 
 org.apache.hadoop.hive.ql.lockmgr.HiveTxnManagerImpl.finalize(HiveTxnManagerImpl.java:44)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:101)
at java.lang.ref.Finalizer.access$100(Finalizer.java:32)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:190)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25716: Type coercion for union queries.

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25716/#review53979
---

Ship it!


Ship It!

- John Pullokkaran


On Sept. 19, 2014, 5:55 p.m., Ashutosh Chauhan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25716/
 ---
 
 (Updated Sept. 19, 2014, 5:55 p.m.)
 
 
 Review request for hive and John Pullokkaran.
 
 
 Bugs: HIVE-8150
 https://issues.apache.org/jira/browse/HIVE-8150
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 Type coercion for union queries.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 0d934ef 
 
 Diff: https://reviews.apache.org/r/25716/diff/
 
 
 Testing
 ---
 
 union32.q
 
 
 Thanks,
 
 Ashutosh Chauhan
 




Re: Review Request 25754: HIVE-8111 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25754/#review53982
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java
https://reviews.apache.org/r/25754/#comment93860

Avoid the CBO name in the function. It's a generic function whose current consumer is CBO.


- John Pullokkaran


On Sept. 17, 2014, 9:25 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25754/
 ---
 
 (Updated Sept. 17, 2014, 9:25 p.m.)
 
 
 Review request for hive, Ashutosh Chauhan and John Pullokkaran.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/RexNodeConverter.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 
 6131d3d 
   ql/src/test/queries/clientpositive/decimal_udf.q 591c210 
   ql/src/test/results/clientpositive/decimal_udf.q.out c5c2031 
 
 Diff: https://reviews.apache.org/r/25754/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




Re: Review Request 25394: HIVE-7503: Support Hive's multi-table insert query with Spark [Spark Branch]

2014-09-19 Thread Chao Sun


 On Sept. 19, 2014, 5:45 p.m., Xuefu Zhang wrote:
  ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java,
   line 142
  https://reviews.apache.org/r/25394/diff/3/?file=693788#file693788line142
 
  Here we are mapping the children of lca to lca itself. Why is this 
  necessary, given that you can find the children of lca later without the map? Can't 
  we just store lca here?
 
 Chao Sun wrote:
  The problem is that we are only generating one FS but multiple TSs. 
  After the FS and the first TS are generated, the child-parent relation 
  is lost (since the op tree is modified), and hence we need to store this 
  information somewhere else, to be used when processing the rest of the TSs.

It might be tricky to just store the LCA. When the graph walker reaches a node, it 
needs to check whether that node is a child of the LCA, and if so, break the plan.
You could say that since we have the LCA, we have all its children info. However, 
after the first child is processed, the children of the LCA are changed, so we need to 
store this info somewhere, IMHO.
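
A minimal sketch of the map being discussed, taken as a snapshot before the op tree is mutated (editorial illustration; the helper name is made up):

{code}
import java.util.IdentityHashMap;
import java.util.Map;

import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.plan.OperatorDesc;

public class LcaChildMapSketch {

  // Snapshot "child of LCA -> LCA" before the op tree is modified, so the break
  // point can still be recognized while processing the remaining TSs.
  static Map<Operator<? extends OperatorDesc>, Operator<? extends OperatorDesc>>
      snapshotChildrenOfLca(Operator<? extends OperatorDesc> lca) {
    Map<Operator<? extends OperatorDesc>, Operator<? extends OperatorDesc>> childToLca =
        new IdentityHashMap<Operator<? extends OperatorDesc>, Operator<? extends OperatorDesc>>();
    for (Operator<? extends OperatorDesc> child : lca.getChildOperators()) {
      childToLca.put(child, lca);
    }
    return childToLca;
  }
}
{code}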


- Chao


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25394/#review53871
---


On Sept. 18, 2014, 6:38 p.m., Chao Sun wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25394/
 ---
 
 (Updated Sept. 18, 2014, 6:38 p.m.)
 
 
 Review request for hive, Brock Noland and Xuefu Zhang.
 
 
 Bugs: HIVE-7503
 https://issues.apache.org/jira/browse/HIVE-7503
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 For Hive's multi-insert query 
 (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there 
 may be an MR job for each insert. When we achieve this with Spark, it would 
 be nice if all the inserts could happen concurrently.
 It seems that this functionality isn't available in Spark. To make things 
 worse, the source of the insert may be re-computed unless it's staged. Even 
 with staging, the inserts will happen sequentially, making performance 
 suffer.
 This task is to find out what it takes in Spark to enable this without requiring 
 staging of the source and sequential insertion. If this has to be solved in 
 Hive, find out an optimal way to do it.
 
 
 Diffs
 -
 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkProcContext.java 
 4211a0703f5b6bfd8a628b13864fac75ef4977cf 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkUtils.java 
 695d8b90cb1989805a7ff4e39a9635bbcea9c66c 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/GenSparkWork.java 
 864965e03a3f9d665e21e1c1b10b19dc286b842f 
   ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkCompiler.java 
 76fc290f00430dbc34dbbc1a0cef0d0eb59e6029 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMergeTaskProcessor.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkMultiInsertionProcessor.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkProcessAnalyzeTable.java
  5fcaf643a0e90fc4acc21187f6d78cefdb1b691a 
   
 ql/src/java/org/apache/hadoop/hive/ql/parse/spark/SparkTableScanProcessor.java
  PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/25394/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Chao Sun
 




[jira] [Updated] (HIVE-7812) Disable CombineHiveInputFormat when ACID format is used

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-7812:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I committed this. Thanks for the review, Ashutosh.

 Disable CombineHiveInputFormat when ACID format is used
 ---

 Key: HIVE-7812
 URL: https://issues.apache.org/jira/browse/HIVE-7812
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: HIVE-7812.patch, HIVE-7812.patch, HIVE-7812.patch, 
 HIVE-7812.patch


 Currently CombineHiveInputFormat complains when called on an ACID 
 directory. Modify CombineHiveInputFormat so that HiveInputFormat is used 
 instead if the directory is in ACID format.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-7145) Remove dependence on apache commons-lang

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-7145:

Assignee: (was: Owen O'Malley)

 Remove dependence on apache commons-lang
 

 Key: HIVE-7145
 URL: https://issues.apache.org/jira/browse/HIVE-7145
 Project: Hive
  Issue Type: Bug
Reporter: Owen O'Malley

 We currently depend on both Apache commons-lang and commons-lang3. They are 
 the same project, just at version 2.x vs 3.x. I propose that we move all of 
 the references in Hive to commons-lang3 and remove the v2 usage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25754: HIVE-8111 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread John Pullokkaran

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25754/#review53986
---



ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java
https://reviews.apache.org/r/25754/#comment93862

Instead of these changes, why don't you use 
FunctionRegistry.getTypeInfoForPrimitiveCategory(a, b)?


- John Pullokkaran


On Sept. 17, 2014, 9:25 p.m., Sergey Shelukhin wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/25754/
 ---
 
 (Updated Sept. 17, 2014, 9:25 p.m.)
 
 
 Review request for hive, Ashutosh Chauhan and John Pullokkaran.
 
 
 Repository: hive-git
 
 
 Description
 ---
 
 see jira
 
 
 Diffs
 -
 
   
 ql/src/java/org/apache/hadoop/hive/ql/optimizer/optiq/translator/RexNodeConverter.java
  PRE-CREATION 
   
 ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFBaseNumeric.java 
 6131d3d 
   ql/src/test/queries/clientpositive/decimal_udf.q 591c210 
   ql/src/test/results/clientpositive/decimal_udf.q.out c5c2031 
 
 Diff: https://reviews.apache.org/r/25754/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Sergey Shelukhin
 




[jira] [Commented] (HIVE-8111) CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO

2014-09-19 Thread Laljo John Pullokkaran (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141065#comment-14141065
 ] 

Laljo John Pullokkaran commented on HIVE-8111:
--

Why not use FunctionRegistry.getTypeInfoForPrimitiveCategory() to decide the 
common type; maybe add a utility to find the common type among n args.
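
A minimal sketch of such an n-arg utility: fold a pairwise common-type function over the argument types. The pairwise resolver is passed in, since the exact FunctionRegistry signature is not shown here; plugging in the method named above is left to the real patch.

{code}
import java.util.List;

import org.apache.hadoop.hive.serde2.typeinfo.TypeInfo;

public class CommonTypeSketch {

  // Pairwise common-type resolver; plug in the FunctionRegistry method named above.
  interface PairwiseCommonType {
    TypeInfo common(TypeInfo a, TypeInfo b);
  }

  // Fold the resolver over all argument types; returns null if some pair has no common type.
  static TypeInfo commonType(List<TypeInfo> argTypes, PairwiseCommonType resolver) {
    TypeInfo common = null;
    for (TypeInfo t : argTypes) {
      common = (common == null) ? t : resolver.common(common, t);
      if (common == null) {
        return null;
      }
    }
    return common;
  }
}
{code}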

 CBO trunk merge: duplicated casts for arithmetic expressions in Hive and CBO
 

 Key: HIVE-8111
 URL: https://issues.apache.org/jira/browse/HIVE-8111
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8111.patch


 Original test failure: looks like column type changes to different decimals 
 in most cases. In one case it causes the integer part to be too big to fit, 
 so the result becomes null it seems.
 What happens is that CBO adds casts to arithmetic expressions to make them 
 type compatible; these casts become part of new AST, and then Hive adds casts 
 on top of these casts. This (the first part) also causes lots of out file 
 changes. It's not clear how to best fix it so far, in addition to incorrect 
 decimal width and sometimes nulls when width is larger than allowed in Hive.
 Option one - don't add those for numeric ops - cannot be done if numeric op 
 is a part of compare, for which CBO needs correct types.
 Option two - unwrap casts when determining type in Hive - hard or impossible 
 to tell apart CBO-added casts and user casts. 
 Option three - don't change types in Hive if CBO has run - seems hacky and 
 hard to ensure it's applied everywhere.
 Option four - map all expressions precisely between two trees and remove 
 casts again after optimization, will be pretty difficult.
 Option five - somehow mark those casts. Not sure about how yet.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: HIVE-8184.2.patch

fix null pointer exception

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Open  (was: Patch Available)

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Patch Available  (was: Open)

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: (was: HIVE-8184.2.patch)

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-6936) Provide table properties to InputFormats

2014-09-19 Thread Owen O'Malley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Owen O'Malley updated HIVE-6936:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

I messed up and committed both HIVE-7812 and HIVE-6936 (with one change) in 
r1626292. The last part of HIVE-6936 is r1626294.

 Provide table properties to InputFormats
 

 Key: HIVE-6936
 URL: https://issues.apache.org/jira/browse/HIVE-6936
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Reporter: Owen O'Malley
Assignee: Owen O'Malley
 Fix For: 0.14.0

 Attachments: HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, 
 HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch, 
 HIVE-6936.patch, HIVE-6936.patch, HIVE-6936.patch


 Some advanced file formats need the table properties made available to them. 
 Additionally, it would be convenient to provide a unique id for fetch 
 operators and the complete list of directories.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error

2014-09-19 Thread Alan Gates (JIRA)
Alan Gates created HIVE-8191:


 Summary: Update and delete on tables with non Acid output formats 
gives runtime error
 Key: HIVE-8191
 URL: https://issues.apache.org/jira/browse/HIVE-8191
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical


{code}
create table not_an_acid_table(a int, b varchar(128));
insert into table not_an_acid_table select cint, cast(cstring1 as varchar(128)) 
from alltypesorc where cint is not null order by cint limit 10;
delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
{code}

This generates a runtime error.  It should get a compile error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8150) [CBO] Type coercion in union queries

2014-09-19 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-8150:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to cbo branch.

 [CBO] Type coercion in union queries
 

 Key: HIVE-8150
 URL: https://issues.apache.org/jira/browse/HIVE-8150
 Project: Hive
  Issue Type: Bug
  Components: CBO
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-8150.cbo.patch, HIVE-8150.cbo.patch


 If we can't get common type from Optiq, bail out for now.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-8100:
---
Description: Basically the idea here is to have an option to not delete the 
warehouse directory. I am using an env variable so it's always passed to all 
sub-processes. This is useful when you want to see the table structure of the 
warehouse directory after a test.  (was: Basically the idea here is to have an 
option to not delete the warehouse directory. I am using an env variable so 
it's always passed to all sub-processes.)
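
A minimal sketch of the behaviour described (not the actual QTestUtil change; the helper and its arguments are illustrative):

{code}
import java.io.File;
import java.io.IOException;

import org.apache.commons.io.FileUtils;

public class LeaveFilesSketch {

  // Delete the warehouse directory after a test only when QTEST_LEAVE_FILES is
  // unset, so the directory layout can be inspected when the variable is exported.
  static void cleanupWarehouse(File warehouseDir) throws IOException {
    if (System.getenv("QTEST_LEAVE_FILES") == null) {
      FileUtils.deleteDirectory(warehouseDir);
    }
  }
}
{code}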

 Add QTEST_LEAVE_FILES to QTestUtil
 --

 Key: HIVE-8100
 URL: https://issues.apache.org/jira/browse/HIVE-8100
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8100.patch, HIVE-8100.patch


 Basically the idea here is to have an option to not delete the warehouse 
 directory. I am using an env variable so it's always passed to all 
 sub-processes. This is useful when you want to see the table structure of the 
 warehouse directory after a test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Open  (was: Patch Available)

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8100) Add QTEST_LEAVE_FILES to QTestUtil

2014-09-19 Thread Szehon Ho (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141085#comment-14141085
 ] 

Szehon Ho commented on HIVE-8100:
-

+1

 Add QTEST_LEAVE_FILES to QTestUtil
 --

 Key: HIVE-8100
 URL: https://issues.apache.org/jira/browse/HIVE-8100
 Project: Hive
  Issue Type: Improvement
Reporter: Brock Noland
Assignee: Brock Noland
 Attachments: HIVE-8100.patch, HIVE-8100.patch


 Basically the idea here is to have an option to not delete the warehouse 
 directory. I am using an env variable so it's always passed to all 
 sub-processes. This is useful when you want to see the table structure of the 
 warehouse directory after a test.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Status: Patch Available  (was: Open)

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8184) inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread Pengcheng Xiong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pengcheng Xiong updated HIVE-8184:
--
Attachment: HIVE-8184.2.patch

  inconsistence between colList and columnExprMap when ConstantPropagate is 
 applied to subquery
 --

 Key: HIVE-8184
 URL: https://issues.apache.org/jira/browse/HIVE-8184
 Project: Hive
  Issue Type: Improvement
Reporter: Pengcheng Xiong
Priority: Minor
 Attachments: HIVE-8184.1.patch, HIVE-8184.2.patch


 Query like 
  select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
 from src a join src1 b where a.key = '428' ) c;
 will fail as
 FAILED: Execution Error, return code 2 from 
 org.apache.hadoop.hive.ql.exec.mr.MapRedTask



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 25800: inconsistence between colList and columnExprMap when ConstantPropagate is applied to subquery

2014-09-19 Thread pengcheng xiong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25800/
---

(Updated Sept. 19, 2014, 6:55 p.m.)


Review request for hive.


Changes
---

address null pointer exception


Repository: hive-git


Description
---

Query like
select * from (select a.key as ak, a.value as av, b.key as bk, b.value as bv 
from src a join src1 b where a.key = '428' ) c;
will fail as
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Diffs (updated)
-

  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConstantPropagateProcFactory.java
 790a92e 
  ql/src/test/queries/clientpositive/constantPropagateForSubQuery.q 
PRE-CREATION 
  ql/src/test/results/clientpositive/constantPropagateForSubQuery.q.out 
PRE-CREATION 

Diff: https://reviews.apache.org/r/25800/diff/


Testing
---


Thanks,

pengcheng xiong



[jira] [Assigned] (HIVE-7856) Enable parallelism in Reduce Side Join [Spark Branch]

2014-09-19 Thread Szehon Ho (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Szehon Ho reassigned HIVE-7856:
---

Assignee: Szehon Ho

 Enable parallelism in Reduce Side Join [Spark Branch]
 -

 Key: HIVE-7856
 URL: https://issues.apache.org/jira/browse/HIVE-7856
 Project: Hive
  Issue Type: New Feature
  Components: Spark
Reporter: Szehon Ho
Assignee: Szehon Ho

 This is dependent on new transformation to be provided by SPARK-2978, see 
 parent JIRA for details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141095#comment-14141095
 ] 

Gopal V commented on HIVE-8188:
---

[~prasanth_j]: that is a pretty neat speedup.

But that's not the place I found the fix in; it was in isDeterministic() within 
the constant codepath in the ExprNode evaluator.
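
A minimal sketch of that kind of fix (not the committed change): read the UDFType annotation once and cache the deterministic flag, instead of touching class metadata for every row.

{code}
import org.apache.hadoop.hive.ql.udf.UDFType;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;

public class DeterministicCacheSketch {

  private final boolean deterministic;

  // Read the UDFType annotation once, at construction time.
  DeterministicCacheSketch(GenericUDF udf) {
    UDFType type = udf.getClass().getAnnotation(UDFType.class);
    this.deterministic = type == null || type.deterministic();
  }

  // Constant-time answer, no per-row reflection.
  boolean isDeterministic() {
    return deterministic;
  }
}
{code}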

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8188) ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight loop

2014-09-19 Thread Prasanth J (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141099#comment-14141099
 ] 

Prasanth J commented on HIVE-8188:
--

Looking at the attached PNG (GBY + Reflection), I thought it was a UDAF that uses 
reflection in the inner loop. Looks like many places need improvement then. 

 ExprNodeGenericFuncEvaluator::_evaluate() loads class annotations in a tight 
 loop
 -

 Key: HIVE-8188
 URL: https://issues.apache.org/jira/browse/HIVE-8188
 Project: Hive
  Issue Type: Bug
  Components: UDF
Affects Versions: 0.14.0
Reporter: Gopal V
 Attachments: udf-deterministic.png


 When running a near-constant UDF, most of the CPU is burnt within the VM 
 trying to read the class annotations for every row.
 !udf-deterministic.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8186) CBO Trunk Merge: join_vc fails

2014-09-19 Thread Hive QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141113#comment-14141113
 ] 

Hive QA commented on HIVE-8186:
---



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12669886/HIVE-8186.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 6293 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_vc
org.apache.hadoop.hive.ql.parse.TestParse.testParse_union
{noformat}

Test results: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/879/testReport
Console output: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/879/console
Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-879/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12669886

 CBO Trunk Merge: join_vc fails
 --

 Key: HIVE-8186
 URL: https://issues.apache.org/jira/browse/HIVE-8186
 Project: Hive
  Issue Type: Sub-task
  Components: CBO
Reporter: Sergey Shelukhin
Assignee: Sergey Shelukhin
 Attachments: HIVE-8186.patch


 Simplified query appears to fail in CBO branch even with CBO disabled. I'm 
 looking...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-8185) hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in build

2014-09-19 Thread Gopal V (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14141130#comment-14141130
 ] 

Gopal V commented on HIVE-8185:
---

+1 - LGTM.

 hive-jdbc-0.14.0-SNAPSHOT-standalone.jar fails verification for signatures in 
 build
 ---

 Key: HIVE-8185
 URL: https://issues.apache.org/jira/browse/HIVE-8185
 Project: Hive
  Issue Type: Bug
  Components: JDBC
Affects Versions: 0.14.0
Reporter: Gopal V
Priority: Critical
 Attachments: HIVE-8185.1.patch, HIVE-8185.2.patch


 In the current build, running
 {code}
 jarsigner --verify ./lib/hive-jdbc-0.14.0-SNAPSHOT-standalone.jar
 Jar verification failed.
 {code}
 unless that jar is removed from the lib dir, all hive queries throw the 
 following error 
 {code}
 Exception in thread main java.lang.SecurityException: Invalid signature 
 file digest for Manifest main attributes
   at 
 sun.security.util.SignatureFileVerifier.processImpl(SignatureFileVerifier.java:240)
   at 
 sun.security.util.SignatureFileVerifier.process(SignatureFileVerifier.java:193)
   at java.util.jar.JarVerifier.processEntry(JarVerifier.java:305)
   at java.util.jar.JarVerifier.update(JarVerifier.java:216)
   at java.util.jar.JarFile.initializeVerifier(JarFile.java:345)
   at java.util.jar.JarFile.getInputStream(JarFile.java:412)
   at 
 sun.misc.URLClassPath$JarLoader$2.getInputStream(URLClassPath.java:775)
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-8191) Update and delete on tables with non Acid output formats gives runtime error

2014-09-19 Thread Alan Gates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-8191:
-
Attachment: HIVE-8191.patch

Added a check when updating and deleting that the table is ACID compliant.
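
A minimal sketch of such a compile-time guard (not the attached patch), assuming the check keys off whether the target table's output format is ACID-capable; the error text is illustrative.

{code}
import org.apache.hadoop.hive.ql.io.AcidOutputFormat;
import org.apache.hadoop.hive.ql.metadata.Table;
import org.apache.hadoop.hive.ql.parse.SemanticException;

public class AcidTargetCheckSketch {

  // Reject UPDATE/DELETE at analysis time when the target table's output format
  // cannot produce ACID files.
  static void checkAcidTarget(Table table) throws SemanticException {
    Class<?> outputFormat = table.getOutputFormatClass();
    if (outputFormat == null || !AcidOutputFormat.class.isAssignableFrom(outputFormat)) {
      throw new SemanticException("Table " + table.getTableName()
          + " does not use an ACID-capable output format; UPDATE and DELETE are not supported on it");
    }
  }
}
{code}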

 Update and delete on tables with non Acid output formats gives runtime error
 

 Key: HIVE-8191
 URL: https://issues.apache.org/jira/browse/HIVE-8191
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.14.0
Reporter: Alan Gates
Assignee: Alan Gates
Priority: Critical
 Attachments: HIVE-8191.patch


 {code}
 create table not_an_acid_table(a int, b varchar(128));
 insert into table not_an_acid_table select cint, cast(cstring1 as 
 varchar(128)) from alltypesorc where cint is not null order by cint limit 10;
 delete from not_an_acid_table where b = '0ruyd6Y50JpdGRf6HqD';
 {code}
 This generates a runtime error.  It should get a compile error instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

